NOVEL AMPHIPHILIC NUCLEIC ACID CONJUGATES


INTRODUCTION 

Technical Field 

The subject invention relates to specific 
polynucleotide binding polymers conjugated to 
solubility modifying moieties for inhibition of 
expression. 

Background 

There is a continuing interest and need for 
agents capable of modulating intracellular expression. 
The agents could have a profound capability of solving 
a variety of genetically associated problems. These 
agents, particularly complementary nucleic acid agents, 
could be used as antiviral agents to inhibit the ex- 
pression of viral essential genes. The agents also 
could act as anti-neoplastic agents, reducing the rate 
of proliferation of cancer cells or inhibiting their 
growth entirely. These agents would act intracellular- 
ly binding to transcription products by a mechanism or 
mechanisms unknown, to inhibit the expression of a par- 
ticular structural gene. 

There has been substantial interest in this 
possibility and a number of experiments in culture have 
shown that there may be some promise to this approach. 
However, there are also numerous short-comings to the 
approaches that have been used previously. In order to
provide a useful agent for therapy, the agent should be
effective at low concentrations, so as to allow for
relatively low dosages when administered systemically.
Secondly, agents should be relatively stable and resistant
to degradation by the various nucleases. Thirdly, the agent
should act very rapidly once introduced into the cytoplasm
and be highly specific in binding to its complementary
sequence, so as to avoid long incubation periods. Fourth,
the agent should be able to penetrate the membrane. The
agent should be effective at low concentrations to avoid
high concentrations in the blood stream. Finally, adverse
effects to the mammalian host should be minimized and the
oligonucleotide agent should provide for a minimal
immunogenic response. While various of these criteria may
be met to different degrees, the agents that have been
produced so far fall far short of agents which might find
general use.

Relevant Literature 

The use of relatively short probes to maximize
selectivity while retaining high sensitivity to single
base mismatches is suggested by Szostak, et al
. Methods Enzymo! (, 97 9) 68,419-429, Wu, MaTurT^ew 

20 ' ! 5 80> „^ !,,,0,! H0 ^' i^ioi^he,. (, 97771^7472- 

76 1770 u;^-^ "°1~979) 
11-1770 1774; Agarwal, et al., J. Biol. <»,.„ 

i56 = 1023-,028. T.lU., a£ a.. ^I^r^," , I. . 

^.(1980)^=941, O^neta l-.^^J^""- 
( 1 983 ) li: 775 , Conner et al . , Pro o . Natl. Acad Sol 
5 ^ ( ' 98 3> 10 = 278, Piratsu et al., „e* E, g . \ ~ 
<1983, 309=284-237, WallaceTt^ . .^T^*^ 

There have been a number of reports on the use
of specific nucleic acid sequences to inhibit viral
replication. Zamecnik and Stephenson, Proc. Natl. Acad.
Sci. USA (1978) 75:280-284; Tullis et al., J. Cellular
Biochem. Suppl. (1984) 8A:58 (Abstract);
Nuc^d^e^ (1985) 77=49,1= K ald 'et 

Utn 3 ^ ) ^i= 569 - 57 '= ^ecni.et al . . Pr7c. 
35 1- Acad. Scl.. ns» f.oa^ a^ .n.ij , ne _ — 

Modified nucleic acids, such as triesters and 
inhibiting expression. Miller et al. , Biochemistry 
(1974) 13:4887-4895; Barrett et al., ibid. (1974)
13:4897-4906; Miller et al., ibid. (1977) 16:1988-1997;
Miller et al., Biochemistry (1981) 20:1873-1880; Blake
et al., Biochemistry (1985a, b) 24:6132 and 6134; Smith
et al., Proc. Natl. Acad. Sci. USA (1986) 83:2787-91;
Agris et al., Biochemistry (1986) 25:6268-6275; Miller
et al., Biochemistry (1986) 25:5092-5097.

Modified nucleic acid sequences for enhancing 
binding to the complementary sequence are reported by 
Vlassov et al., Adv. Enz. Reg. (1986):301-320; Summerton,
J. Theor. Biol. (1979) 78:77-99; Knorre (1986) Adv.
Enz. Reg. 1986:277-300.

Reduced immunogenicity of proteins conjugated
to polyethylene glycol is reported by Tomasi and Fallow,
WO86/0.NJ.5 (PCT/US85/02572) and Abuchowski et al.,
Cancer Biochem. Biophys. (1984) 7:175-186. See also
U.S. Patent Nos. 4,511,713 and 4,587,044.

SUMMARY OF THE INVENTION

Novel nucleic acid conjugates are provided
comprising a relatively short nucleic acid sequence
complementary to a sequence of interest for modifying
intracellular expression, a linking group, and a group
which imparts amphiphilic character to the final
product, usually more hydrophobic than hydrophilic,
where hydrophobic includes amphiphilic. The nucleic
acid moiety may include normal or other sugars,
phosphate groups or modified phosphate groups, or bases
other than the normal bases, where the modifications do
not interfere with complementary binding of the
sequence of interest. The compositions find use for
inhibiting mRNA maturation and/or expression of
particular structural genes, such as in neoplastic
cells, of viral proteins in viral infected cells, and
essential protein(s) of human and animal pathogens.
DESCRIPTION OF THE SPECIFIC EMBODIMENTS

The subject invention provides novel nucleic
acid conjugates for inhibiting intracellular mRNA
maturation and/or expression of a structural gene.
Conjugates comprise a relatively short oligonucleotide
sequence, a linking group, and a group which modifies the
HLB (hydrophilic lipophilic balance) to provide an
amphiphilic product. The amphiphilic nature of
the product aids in the transport of the conjugate
across the cellular membrane and can provide additional
advantages, such as increasing aqueous or lipid
solubility of nucleic acid derivatives, e.g., use of an
amphiphilic group to enhance water solubility of long
chain methyl phosphonates, and stabilizing normal
nucleic acids to exonuclease digestion.

For the most part, compounds of this invention 
will have the following formula: 



{Y - [P(X)Z(N)] - [P(X)Z(N)]a - Y} - L - M
X is usually a pair of electrons, chalcogen
(oxygen or sulfur) or amino, particularly NH;

Z is a naturally occurring or synthetic sugar
residue linked at two of the 2', 3' and 5' hydroxyls of
the five carbon sugars and at comparable sites for six
carbon sugars, where the sugars will usually be ribose
or deoxyribose, or other 5 carbon or 6 carbon,
particularly 5 carbon, sugars such as arabinose, xylose,
glucose, or galactose;

N is any natural or unnatural base (purine or
pyrimidine) capable of binding to
a natural purine or pyrimidine; the purines and
pyrimidines may be the natural nucleoside
purines and pyrimidines, such as adenine, guanine,
cytosine and thymine, or other purines and pyrimidines,
such as uracil, inosine, and the like.
L is a linking group which is derived from a
polyvalent functional group having at least 1 atom, not
more than about 60 atoms other than hydrogen, usually
not more than about 30 atoms other than hydrogen,
having up to about 30 carbon atoms, usually not more
than about 20 carbon atoms, and up to about 10
heteroatoms, more usually up to about 6 heteroatoms,
particularly chalcogen, nitrogen, phosphorus, etc.,
non-oxo-carbonyl (carboxy carbonyl), oxo-carbonyl
(aldehyde or ketone), or the sulfur or nitrogen
equivalents thereof, e.g., thiono, thio, imidyl, etc.,
as well as disulfide, amino, diazo, hydrazine, oximino,
etc., phosphate, phosphono, and the like.
M is a solubility modifying moiety which
imparts amphiphilic character to the molecule,
particularly hydrophobic with phosphates and amphiphilic
with phosphonates, which will have a ratio of carbon to
heteroatom of at least 2:1, usually at least 3:1,
frequently up to greater than 20:1, and may include
hydrocarbons of at least 6 carbon atoms and not more than
about 30 carbon atoms, polyoxy compounds (alkyleneoxy
compounds), where the oxygen atoms are joined by from
about 2 to 10 carbon atoms, usually 2 to 6 carbon
atoms, preferably 2 to 3 carbon atoms, and there will
be at least about 6 units and usually not more than
about 200 alkyleneoxy units, more usually not more than
about 100 units, and preferably not more than about 60
units.

One Y is a bond to L, while the other Y is a
monovalent oxy, thio, amino, sugar group or substituted
functionalities thereof, or alkyl of up to about 20,
usually of up to about 6 carbon atoms, when bonded to
P, or hydrogen, hydrocarbyl or acyl of from 1 to 30,
usually 1 to 12 carbon atoms, or substituted
hydrocarbyl or acyl having from 1 to 4 hetero groups which
are oxy, thio, or amino when bonded to Z.
a is at least 5 and not more than about 50,
usually not more than about 35.

The phosphorus moiety may include phosphate,
phosphoramidate, phosphordiamidate,
phosphorothionate, phosphorothiolate,
phosphoramidothiolate, phosphonate, phosphorimidate and
the like.
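
The numeric limits given above for M (a carbon to heteroatom ratio of at least 2:1, and a bounded number of alkyleneoxy units) can be checked with a few lines of arithmetic. The following is only a sketch: the function names are hypothetical, and the ethyleneoxy repeat mass of 44.05 g/mol is an assumed standard value that ignores end groups.

```python
import math

def carbon_to_heteroatom_ratio(carbons: int, heteroatoms: int) -> float:
    """Ratio of carbon atoms to heteroatoms; infinite for a pure hydrocarbon."""
    if carbons < 0 or heteroatoms < 0:
        raise ValueError("atom counts must be non-negative")
    return math.inf if heteroatoms == 0 else carbons / heteroatoms

def meets_ratio(carbons: int, heteroatoms: int, minimum: float = 2.0) -> bool:
    """True when the group satisfies the stated minimum ratio (2:1 by default)."""
    return carbon_to_heteroatom_ratio(carbons, heteroatoms) >= minimum

def approx_alkyleneoxy_units(avg_mw: float, unit_mass: float = 44.05) -> int:
    """Approximate repeat-unit count of a polyalkyleneoxy chain from its average MW.

    unit_mass defaults to the ethyleneoxy unit (C2H4O); end groups are ignored,
    so the result is an estimate only.
    """
    return round(avg_mw / unit_mass)

if __name__ == "__main__":
    # Ethyleneoxy unit: 2 carbons per oxygen, so it sits exactly at the 2:1 floor.
    print(meets_ratio(2, 1))
    # Estimated unit counts for two average molecular weights.
    print(approx_alkyleneoxy_units(3500), approx_alkyleneoxy_units(5000))
```

An ethyleneoxy unit sits exactly at the 2:1 minimum, while a propyleneoxy unit reaches the "usually at least 3:1" level.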

The purines and pyrimidines may include
thymidine, uracil, cytosine, 6-methyluracil,
4,6-dihydroxypyrimidine, isocytosine, xanthine,
adenosine, guanosine, and the like.

The sugars may be ribose, arabinose, xylulose,
or derivatives thereof. Other nucleosides may
also employ hexoses.

„,„ „ * TarIet7 ° f Unkl!,s « rou I> s ■»»• "8 em- 

Ln«^ fUnoti0 - 11 ^ "I..** for, whether the 

U»M«g group is present during the synthesis of the 

uMlT 0Ua9, P-sent on 

20 Zllll a ° iltnng " U " the »*•• * — br of 
linking groups are commercially available and have
found extensive use for linking polyfunctional
compounds. The linking groups include:
-0 C H ? 0H 2ra C0(CH 2 ) n C0NH- i -bOH 2 CH 2 »H-X-(CH,) NH— 0- 

SCHjCHjCQ-; -C0WTS-, -(NCH 2 CH 2 ) m CH,lT-. 

ly i„T r 7 f anln ° aC " 3 ' 3U * " 

1° 2 00O dair ynethl ° nIne ' St0 - U3Ua " y ° f «"« S00 to 
*,uoo daltons; wherein Y ? c „ , 
. . in x 13 2 '5-quinondiyl t Y is 

>o. usuiuy : L' 6 Bore usuaiiir 2 to ,2 - - 

35 variety oTl Ufe * iU <""»>"»M Sroup may be a „ide 
varxety of groups, heing aliphatic, aromatic, elioyc- 

.Least b, more usually at ip a «- 10 

y ac J-easu 12 and not more than 
about 500, usually not more than about 200 carbon
atoms, having not more than about 1 heteroatom per 2
carbon atoms, being charged or uncharged, including
alkyl of at least 6 carbon atoms and up to about 30
carbon atoms, usually not more than about 24 carbon
atoms; fatty acids of at least about 6 carbon atoms,
usually at least about 12 carbon atoms and up to about
24 carbon atoms; glycerides, where the fatty acids will
generally range from about 12-24 carbon atoms, there
being from 1-2 fatty acids, usually at the 2 or 3
positions or both; aromatic compounds having from 1 to 4
rings, either mono- or polycyclic, fused or unfused;
polyalkyleneglycols, where the alkylenes are of from
2-10, usually of from 2-6 carbon atoms, more usually
2-3 carbon atoms, there being usually at least about 6
units, more usually at least about 10 units, and usually
fewer than about 500 units, more usually fewer than
about 200 units, preferably fewer than about 100 units,
where the alkylene glycols may be homopolymers or
copolymers; alkylbenzoyl, where the alkyl group will be
at least about 6 carbon atoms, usually at least about
10 carbon atoms, and not more than about 24 carbon
atoms, usually not more than about 20 carbon atoms;
alkyl phosphates or phosphonates, where the alkyl group
will be at least about 6 carbon atoms, usually at least
about 12 carbon atoms and not more than about 24 carbon
atoms, usually not more than about 20 carbon atoms; or
the like.

The solubility modifying group may be charged
or uncharged, preferably being uncharged, under
physiological conditions, usually having not more than 1
charge per 10 atoms of the group other than hydrogen.
Illustrative groups include polyethylene glycol having
from about 40-50 units, copolymers of ethylene and
propylene glycol, laurate esters of polyethylene
glycols, triphenylmethyl, naphthylphenylmethyl, palmitate,
distearylglyceride, didodecylphosphatidyl, cholesteryl,
arachidonyl, octadecanyloxy, tetradecylthio, etc.
Functionalities which may be present include
oxy, thio, carbonyl (oxo or non-oxo), cyano, halo,
nitro, aliphatic unsaturation, etc.

Of particular interest will be oligonucleotide
conjugates of the following formula:



{Y1 - [P(X1)Z1(N1)] - [P(X1)Z1(N1)]a - Y1} - L1 - M1



X1 is nitrogen or oxygen;

Z1 is ribose or deoxyribose substituted at the
3' and 5' positions;

One Y1 is a bond to L1 and the other Y1 is
hydroxy, alkyl, alkoxy or amino (including substituted
amino, e.g., alkyl, acyl, etc.) of from 0 to 3 carbon
atoms or a five carbon sugar, particularly ribose or
deoxyribose, when bonded to P, and hydrogen, alkyl, or
acyl of from 1 to 10, usually 1 to 6 carbon atoms, when
bonded to Z1;

N1 is any purine or pyrimidine which can
hybridize to the naturally occurring purines and
pyrimidines, but is preferably a naturally occurring
purine or pyrimidine;

L1 is a linking group of at least about 2
carbon atoms and not more than about 30 carbon atoms,
usually not more than about 20 carbon atoms, having
from 0-10, usually 1-6 heteroatoms, which will be
oxygen, nitrogen, and sulfur, particularly as oxy,
amino, or thio;

M1 is the solubility modifying moiety,
hydrophobic or amphiphilic, which is desirably a
polyalkyleneoxy group of at least about 20 units and not
more than about 200 units, normally not more than about
150 units, where the alkylenes are of from 2-3 carbon
atoms;

a is at least 5, usually at least 7 and
generally not more than about 50, usually not more than
about 30, more usually ranging from about 11 to 30,
preferably from about 13 to 30.

In preparing the subject compositions the
oligonucleotide and the solubility modifying moiety
will usually exist as independent moieties and may be
joined together by a linker arm. The oligonucleotide
may be made by any convenient synthetic procedure. For
the most part, recombinant procedures will not be
employed, although in some situations they may be useful.
Various commercial synthetic devices for preparing
polynucleotides are available from a number of
companies, such as Applied Biosystems Inc., Biosearch, Inc.
and Pharmacia. A variety of procedures are known for
employing blocked oligonucleotides as their triesters,
phosphoramidites, phosphonates, or the like, where a
cycling procedure is employed, and the individual
nucleotides are added in succession.

At the completion of the synthesis, various
protocols may be employed. Preferably in most cases,
the terminal blocking group may be removed and the
linking arm joined to the terminal nucleotide.
Alternatively, all of the blocking groups may be
removed and the terminal nucleotide modified by
addition of the linking arm, where the linking arm may
be specific for the final oligonucleotide. In some
instances, the terminal blocking group may serve as all
or part of the linking arm. Alternatively, the
oligonucleotide may be removed from the support and
then manipulated further, particularly where the
linking group to the support may be used as the linking
arm for joining the hydrophobic modifying moiety.
Various procedures for further functionalization of the
5'- or 3'-termini of oligonucleotides may be found in
Chu and Orgel DNA (1985) 4:327-331; Connolly and Rider
Nucl. Acids Res. (1985) 13:4485-4502.
Depending upon the functionalities, various
reactions may be employed to produce amides, esters,
both inorganic and organic, oxygen and sulfur ethers,
amines, or the like. In working with carboxyl groups,
various activating groups may be employed, such as
carbonyldiimidazole, carbodiimides, succinimidyl ester,
_ESra.-<utrophen yl eater> ^ ^ ester, 

Various active functionalities can be em- 

™ IIZ t SU ° h M UOeya " ates - *™ups, immo 

chlorides immo esters, anhydrides, acyl halides, 

etc c ^t 1 " 63 ' l3 ° thl °^anates, aulfonyl 

etc Conditions for carrying out the various reactions 

» joining non-nucleotide moieties to nucleotide 

15 T.tt ^ f ° Und ^ ChU ° rgel S«k (1985) 

JL-327-331 ; Smith, et al. Duel. Acid.-,, p.. ( jo o5 ) 

to the iJT- SOll " ,U " ty .-<iWn« "Oiety may be added 
to the Unking arm 9ither prlor t<)i • 

2o -currently with the addition, of the linking arm to 

the oligonucleotide. For the most part, the solubility 
modifying molety „ U1 be aMed ^ " 

tion of the linking arm to the oligonucleotide L 
some mstances. it may be desirabie to Join the solu- 

1 nklng arm Is bound to the oligonucleotide while the 
o ^nucleotide is still bound to the support. 3 
already indicated, the reactions between the linking 
arm and the solubility modifying moiety win with 
3° h !T'T lr fUn0 " 0I,al P~«*. the na ur of 

required, and the like. 

-lid and'ir: B0St . PaTt ' conditions will be 

of polar and " 3 ° 1Ve ° t3 ° r """-tiens 

35 and inci T nm ~ P ° laC S0l * e " t3 - S ° lvents »»1 -ry 
ethyl eth aCet0nitrlle - "-thyjformamide, di- 

U1 ba for the Pa" in the range of about 
-10 to 60°C. Usually, after completion of the reaction
between components of the conjugate, the resulting
product will be subjected to purification.

The manner of purification may vary, depending
upon whether the oligonucleotide is bound to a support.
For example, where the oligonucleotide is bound to a
support, after addition of the linking arm to the
oligonucleotide, unreacted chains may be degraded, so as
to prevent their contaminating the resulting product.
In such cases, the bonding of the linker to the
oligonucleotide must be sufficiently stable to withstand the
cleavage conditions from the synthesis support, e.g.,
conc. ammonia. Where the oligonucleotide is no longer
bound to the support, whether only reacted with the
linking arm or as the conjugate to the solubility
modifying moiety intermediate or as the final product,
each of the intermediates or final product may be
purified by conventional techniques, such as
electrophoresis, solvent extraction, HPLC, chromatography, or the
like. The purified product is then ready for use.

The subject products will be selected to have
an oligonucleotide sequence complementary to a sequence
of interest. The sequence of interest may be present
in a prokaryotic or eukaryotic cell, a virus, a normal
or neoplastic cell. The sequences may be bacterial
sequences, plasmid sequences, viral sequences,
chromosomal sequences, mitochondrial sequences, plastid
sequences, etc. The sequences may involve open reading
frames for coding proteins, ribosomal RNA, snRNA,
hnRNA, introns, untranslated 5'- and 3'-sequences
flanking open reading frames, etc. The subject
sequences may therefore be involved in inhibiting the
availability of an RNA transcript, inhibiting
expression of a particular protein, enhancing the expression
of a particular protein by inhibiting the expression of
a repressor, reducing proliferation of viruses or
neoplastic cells, etc.
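
Selecting an oligonucleotide complementary to a sequence of interest amounts to taking the reverse complement of the target. A minimal sketch follows, assuming a plain A/C/G/T/U alphabet; the function name and the example target fragment are hypothetical and are not taken from the specification.

```python
# Watson-Crick partners, expressed as DNA bases; U in an RNA target pairs with A.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}

def antisense_dna(target: str) -> str:
    """Return the DNA strand complementary to an RNA or DNA target.

    Both the input and the result are written 5' to 3', so the
    target is read in reverse while each base is complemented.
    """
    target = target.upper()
    unknown = set(target) - set(COMPLEMENT)
    if unknown:
        raise ValueError(f"unexpected bases: {sorted(unknown)}")
    return "".join(COMPLEMENT[base] for base in reversed(target))

if __name__ == "__main__":
    # Hypothetical 9-base mRNA fragment, for illustration only.
    print(antisense_dna("AUGGUGCAC"))
```

In practice the target fragment would be chosen so that the resulting oligonucleotide falls within the chain lengths described above.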
The subject conjugates may be used in „<,
«. vivo for modifying the pheno ^ ™^^a or 

*e proUferatlon of pathogens such as n u „ " * 
ena, protista. myooplasma _ or th e 

10 or expressio7of t h 9 „° " and/ 

sublet " 9 genes of 6I " cell. Tho 

aubjeot compositions may be used - » 
variety of pathos , Protection from a 

UI patnogens ln a mammalian h n „ 
toxigenic bacterid p„„ host, e.g. , entero- 

protists, su 0h as iiLir 00 : 00 " 3 ' Mei33eira - eE °- 

15 cans, such as oarc 1 EntM ° e,,a ' •*•■» neoplastic 
aupressor ..u.^*: ^ " ■ 

be oapabU 1 ^ 3 "':" " ,Uen ° e3 ^ »° 3el6 "°< « « to 
20 -turatlon or express"" 8 ^ '"duct 
-danisms iLo" Id "ith L My »' »• 

composition to Its Jr~t ""^ ° f ^ S ^ 
mr inciude Int rfer e „ 0 rt «T'- - h " 1 « 
transport across the Pr °° 93SlnS - ^Ibltlon of 

■ * -donucxeases. or the uL 0lMT - " 

— -^^STIS * «"«~* to 

IvmphoKines. immunogXobu Ins Toe"" 8 Sr °" th 
«HC ant!gens, b»a or r„a „ , "°*Ptor sites, 

3ff tance. muUlpie < " B «/ 0l '»«ses. antibiotic resls- 
«th metabolic p ooeMes Tl ' ' 

-ids. nudelc 1^'^ ZT^Z" ^ ' 

as introns or fianirfn., ' tC * 33 wel1 

3s open reading r^" * 33 °""" ™ «» 

cne subject compositions. 
THERAPEUTIC APPLICATIONS OF
SYNTHETIC DNA TECHNOLOGY

Area of Application          Specific Application Targets

Infectious Diseases:
  Antivirals, Human          AIDS, Herpes, CMV
  Antivirals, Animal         Chicken Infectious Bronchitis
                             Pig Transmissible Gastroenteritis Virus
  Antibacterial, Human       Drug Resistance Plasmids, E. coli
  Antiparasitic Agents       Malaria
                             Sleeping Sickness (Trypanosomes)

Cancer:
  Direct Anti-Tumor Agents   c-myc oncogene - leukemia
                             other oncogenes
  Adjunctive Therapy         Methotrexate Resistance - leukemia
                             Drug Resistant Tumors - drug transport

Auto Immune Diseases:
  T-cell receptors           Rheumatoid Arthritis
                             Type I Diabetes
                             Systemic Lupus
                             Multiple sclerosis

Organ Transplants            Kidney - OKT3 cells cause GVHD



The subject compositions may be administered
to a host in a wide variety of ways, depending upon
whether the compositions are used in vitro or in vivo.
In vitro, the compositions may be introduced into the
nutrient medium, so as to modulate expression of a
particular gene by transfer across the membrane into the
cell interior such as the cytoplasm and nucleus. The
subject compositions can find particular use in
protecting mammalian cells in culture from mycoplasma, for
modifying phenotype for research purposes, for
evaluating the effect of variation of expression on various
metabolic processes, e.g., production of particular
products, variation in product distribution, or the
like. While no particular additives are necessary for
transport of the subject compositions, the
compositions may be modified by encapsulation in
liposomes or other vesicles, and may be
used in conjunction with permeabilizing agents, e.g.,
non-ionic detergents, Sendai virus, etc.

For in vivo administration, depending upon its
particular purpose, the subject compositions may be
administered in a variety of ways, so that the
compositions may be taken orally, intravascularly,
intraperitoneally, etc. The compositions may be
formulated in a variety of ways, being dispersed in
various physiologically acceptable media, such as
deionized water, water, phosphate buffered saline,
aqueous ethanol, or formulated in the lumen of
vesicles, such as liposomes or albumin microspheres.

Because of the wide variety of applications and
manners of administration, no particular composition
can be suggested. As to each indication, the
subject compositions may be tested in conventional ways
and the appropriate concentrations determined
empirically. Other additives may be included, such as
stabilizers, buffers, additional drugs, detergents,
excipients, etc. These additives are conventional, and
would generally be present in less than about j £j 
usually less than 1 v«, being present In an eff.oUve 

—rati:: r^ 1 * rsruTj~ * - " 
EXPERIMENTAL

EXAMPLE 1

Synthesis of Polyethylene Glycol Derivatives of
Normal DNAs Using Aminolink, Benzoquinone and
Bis-(Aminohexyl) Polyethylene Glycol

Chemical Synthesis of DNA oligonucleotides by the
Amidite Method.



The chemical synthesis of DNA can be carried
out using slight modifications of the conventional
phosphoramidite methods on any commercially available
DNA synthesizer. This method is a modification of the
technique described by Caruthers and coworkers
(Beaucage and Caruthers, Eur. Pat. Appl. 82/102570).

In this technique, 0.1 M nucleoside
phosphoramidites dissolved in anhydrous acetonitrile were
mixed with an equal volume of 0.5 M tetrazole and
sequentially coupled to the 5'-hydroxyl terminal
nucleotide of the growing DNA chain bound to controlled pore
glass supports via a succinate spacer (Matteucci and
Caruthers, Tetrahedron Letters (1980) 21:719-22).
Nucleoside addition was followed by capping of
unreacted 5'-hydroxyls with acetic anhydride, iodine
oxidation, and 5'-detritylation in trichloroacetic
acid-methylene chloride. The resin-bound oligomer was then
dried by extensive washing in anhydrous acetonitrile
and the process repeated. Normal cycle times using
this procedure were 12 minutes with condensation
efficiencies of >98% (as judged by trityl release).
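
To see what a per-cycle condensation efficiency of about 98% implies for a whole chain, the stepwise efficiencies can be multiplied together. The sketch below assumes that the first nucleoside is already attached to the support, so an n-mer needs n - 1 couplings, and that every cycle has the same efficiency; both are simplifying assumptions, and the function name is hypothetical.

```python
def full_length_fraction(length: int, coupling_efficiency: float) -> float:
    """Estimate the fraction of chains that reach full length.

    Assumes the first nucleoside is pre-loaded on the support, so an
    oligomer of `length` bases needs `length - 1` coupling cycles, each
    succeeding independently with probability `coupling_efficiency`.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    if not 0.0 <= coupling_efficiency <= 1.0:
        raise ValueError("coupling_efficiency must lie between 0 and 1")
    return coupling_efficiency ** (length - 1)

if __name__ == "__main__":
    for n in (10, 20, 30):
        print(n, round(full_length_fraction(n, 0.98), 3))
```

Under these assumptions roughly two thirds of the chains of a 20-mer are full length at 98% per step, which is one reason the capping and purification steps matter.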

As the last step of the synthesis, trityl was
removed from the product oligonucleotide chains and an
aminoethanolphosphoramidite was added to the
5'-hydroxyl using Aminolink (Applied Biosystems, Foster
City, CA). The resin-bound oligonucleotide was then
deblocked and released from the column using a method
appropriate to the type of phosphate linkage present.
For normal phosphodiesters, release from the column and
hydrolysis overnight at 55°C in concentrated ammonium
hydroxide was appropriate.

The product was lyophilized several times
from 50% aqueous ethanol and purified via reversed
phase HPLC on C-8 silica columns, eluting with 5 to 50%
acetonitrile/25 mM ammonium acetate, pH 6.8, in a linear
gradient. If required, the material may be further
purified by ion-exchange HPLC on Nucleogen DEAE 60-7,
eluting with 20% acetonitrile/25 mM ammonium acetate,
pH 6.5. The recovered product was then characterized
by gel electrophoresis on 15% polyacrylamide gels
carried out as described by Maxam and Gilbert in Methods
of Enzymol. (1980) 65:499-560. Oligonucleotides
in finished gels were visualized using Stains-All. The
Stains-All procedure did not work for uncharged
oligonucleotides such as DNA methylphosphonates or ethyl
triesters.

The fully deblocked and purified product is
then converted to the appropriate polyethylene glycol
derivative using a suitable coupling technique. Several
coupling reagents can be used, including benzoquinone,
carbodiimide, SMCC (succinimidyl
4-(N-maleimidomethyl)-cyclohexane-1-carboxylate), SPDP
(N-succinimidyl 3-(2-pyridyldithio)propionate),
carbonyldiimidazole, Aminolink, disuccinimidyl
suberimidate and phenylisocyanate.

■ g°uplin K of the link.. bmam|t 

«ycol is IT* f r " bis - (a " lno "«rt)POlyetnylene 
glycol „ reacted „ith a ,00 to ,000 fold -clear excess 
of oenzccuinone.ln 0., „ S odtu D bicarbonate (pH 8.5, 
«ter , hour at. roc* temperature, the excess Lea ed 

Ld 8 e o 0 , MY:""" 8 ' '^col 1. then 

DMA », m bloarto "«- and reacted „ith the 

° llg01,er '"^1 ■» "active aaine linker .„ I„ 
a molar ratio of 10:1 and the reaction allowed to
proceed to completion. At the end of the reaction
(generally overnight) the unreacted oligomer is removed by
gel-filtration on Sephadex G-100 and the complex
characterized by polyacrylamide gel electrophoresis (cf.
Maniatis, et al., Molecular Cloning, A Laboratory
Manual (1982) Cold Spring Harbor Laboratories, Cold
Spring Harbor, NY). Further purification can be
effected using ion-exchange chromatography and gel
electrophoresis as required.

The structure of the product of these 
reactions is: 

oligomer - P - O - (CH2)2 - NH - [benzoquinone ring] - NH - (CH2)6 - PEG (3500)

EXAMPLE 2 

Synthesis of Polyethylene Glycol Derivatives of Normal
DNAs Using Aminolink and Carbonyldiimidazole
Activated Polyethylene Glycol

In this example the Aminolink oligonucleotide
was synthesized as described in Example 1. After
removal of the oligomer from the support and deblocking
in ammonia, the solution was evaporated in vacuo and
dissolved in 0.1 M NaHCO3, pH 8.5, and purified on a G25
spun column to convert the material to the sodium salt
and to remove any extraneous amine-containing material
of low molecular weight. The solution was then made to
0.2 M in carbonyldiimidazole-activated polyethylene
glycol (MW av = 20,000) and allowed to react overnight
at 23°C.
Unbound oligonucleotide was removed by gel
filtration on Sephadex G-100. On this column the
complex eluted in the excluded volume of the column
while the free oligonucleotide and unbound polyethylene
glycol were retained. This material was then
concentrated in vacuo and the complex characterized by
polyacrylamide gel electrophoresis (Maniatis et al.
(1982), supra).

EXAMPLE 3 

Synthesis of Polyethylene Glycol Derivatives of
Normal DNAs Using Phosphoramidate Linkers and
N-Hydroxysuccinimidyl Activated Polyethylene Glycol

In this method DNA is synthesized as in
Example 1 with the exception that the trityl group is
removed without the further addition of the Aminolink
phosphoramidite. After purification by polyacrylamide
gel electrophoresis, the product DNA is phosphorylated
by the reaction of T4 polynucleotide kinase

according to standard procedures (Miller et al., Nucl 
A£l^_Res, ( 1983 ) ^6225^2, Maniatis .^ (7^ 

(1980) 71:560-5. Labeled oligomers can be separated
from unreacted ATP by DEAE chromatography and C-18
reverse phase columns (e.g. Waters C-18 SepPak).
Samples are checked for purity on analytical 20%
polyacrylamide gels.

The phosphorylated oligomer is then treated
with imidazole and hexanediamine, in the presence
of EDC carbodiimide according to the method of Chu
and Orgel DNA (1985) 4:327-31. This reaction
covalently couples the diamine linker to the
oligonucleotides via a phosphoramidate linkage with the
following structure:
           O
           ||
oligomer - P - NH - (CH2)6 - NH2
           |
           OH

The amine linker arm oligomer is then
conjugated to NHS-succinylmonomethoxypolyethylene glycol (MW
5000) as follows. The oligonucleotide is dissolved to
a final concentration of 100 µM in 50 mM
sodium phosphate buffer, pH 7.1, containing 0.15 M
NaCl. To this solution a 10 fold molar excess of
SS-PEG (5000) is added as a dry solid, allowed to dissolve
and the reaction mixture incubated overnight at 25°C.
The product is then purified by gel filtration
chromatography on Sephadex G-100 in water and characterized
by polyacrylamide gel electrophoresis.
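
The amount of dry reagent corresponding to a given molar excess follows directly from the oligonucleotide concentration, the reaction volume and the reagent molecular weight. The sketch below is illustrative only: the function name and the 1 ml reaction volume are assumptions, while the 100 µM concentration, 10 fold excess and MW of 5000 are the figures quoted above.

```python
def reagent_mass_mg(oligo_conc_um: float, volume_ml: float,
                    molar_excess: float, reagent_mw: float) -> float:
    """Mass of reagent (mg) needed for a given molar excess over the oligomer.

    oligo_conc_um : oligonucleotide concentration in micromoles per liter
    volume_ml     : reaction volume in milliliters
    molar_excess  : moles of reagent per mole of oligonucleotide
    reagent_mw    : reagent molecular weight in g/mol
    """
    oligo_mol = oligo_conc_um * 1e-6 * volume_ml * 1e-3   # mol of oligomer
    reagent_g = oligo_mol * molar_excess * reagent_mw      # grams of reagent
    return reagent_g * 1e3                                 # convert to mg

if __name__ == "__main__":
    # 100 uM oligomer, assumed 1 ml volume, 10 fold excess, MW 5000.
    print(round(reagent_mass_mg(100, 1.0, 10, 5000), 3))
```

For the assumed 1 ml volume this works out to about 5 mg of activated polyethylene glycol; the figure scales linearly with volume.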

The structure of the final product is- 

oligomer - P(O)(OH) - NH - (CH2)6 - NH - CO - (CH2)2 - CO - O - PEG (5000)
EXAMPLE 4

Synthesis of Polyethylene Glycol Derivatives of
Normal DNAs Using Imidazole Activated Carboxylic
Acid Esters and Bis-Aminoalkyl Polyethylene Glycol

In this example, DNA was synthesized according
to the method given in Example 1. After synthesis, the
product material was retained on the synthesis support
with trityl removed from the 5' end of the molecule.
The solid material was then thoroughly washed with
anhydrous acetonitrile and blown dry under a stream of
dry argon. Using a plastic syringe, 1 cc of 0.3 M
carbonyldiimidazole dissolved in anhydrous acetonitrile
was pushed slowly through the synthesis column
containing the support bound oligomer over the course of 1
hour. The 5' carbonylimidazole activated oligomer on
the column was then washed free of excess reagent with
ml of acetonitrile and subsequently treated for 16
hours with 0.1 M bis (aminohexyl) polyethylene glycol
in acetonitrile, water, acetonitrile and methylene
chloride in succession. The polyethylene oligomer
conjugate was then eluted with concentrated ammonium
hydroxide and deblocked in the same by incubation at
55°C for 5 hours.

The reaction product is then purified by high
performance gel filtration chromatography (HPGFC) on a
TSK G4000SW column eluted with 10 mM Tris, pH 7.5, at
0.5 ml per minute. Further purification may be effected by
agarose gel electrophoresis. The structure of the
final conjugate synthesized by this method is:



               O                      O              O
               ||                     ||             ||
oligomer - O - C - NH - (CH2)6 - NH - C - (CH2)2 - C - O - PEG (5000)

EXAMPLE 5

Synthesis of Long Chain Amine Derivatives of
Normal DNAs Using Imidazole-Activated
Carboxylic Acid Esters and Aminoalkanes

In this example, a 20 nucleotide DNA complementary
to the initiation region of mouse β-globin mRNA
was synthesized according to the method given in
Example 1. After synthesis, the product material was
retained on the synthesis support with trityl removed
from the 5' end of the molecule. The solid material
was then thoroughly washed with anhydrous acetonitrile
and blown dry under a stream of dry argon. Using a
plastic syringe, 1 cc of 0.3 M carbonyldiimidazole
dissolved in anhydrous acetonitrile was pushed slowly
through the synthesis column containing the
support-bound oligomer for 45 minutes. The 5'
carbonylimidazole activated oligomer on the column was
then washed free of excess reagent with 15 ml of
acetonitrile and then treated with 0.2 M decanediamine
in acetonitrile-water (10:1) for 30 minutes.

The material on the column was washed free of
unreacted decanediamine with acetonitrile and water and
then eluted from the column in concentrated ammonium
hydroxide solution. After removal from the column, the
ammonium hydroxide solution containing the oligomer
conjugate was placed in a sealed vial and incubated 5
hours at 55°C.

The product was then lyophilized several times
from 50% aqueous ethanol and purified via reversed
phase HPLC on C-8 silica columns eluted with 5 to 50%
acetonitrile/25 mM ammonium acetate, pH 6.8, in a linear
gradient. If required, the material may be further
purified by ion-exchange HPLC on Nucleogen DEAE 60-7
using 20% acetonitrile/25 mM ammonium acetate, pH 6.5,
as eluent. The recovered product was then characterized
by gel electrophoresis in 15% polyacrylamide gels
carried out as described by Maxam and Gilbert in Meth.
Enzymol. (1980) 65:499-560. Oligonucleotides in
finished gels were visualized using Stains-all.

The presence of a primary amine was determined
by two methods. First, reaction with fluorescamine
produced a fluorescent product characteristic of the
presence of a primary amine while no fluorescence was
observed with similarly treated control oligomers of
the same type but lacking the amine linker. Second,
the decane conjugate was dissolved in 100 µl 0.1 M
sodium bicarbonate to which was added 1 mg of
fluorescein isothiocyanate (FITC). After 1 hour of incubation,
the unreacted FITC was removed by gel filtration
chromatography on Sephadex G-25 spun columns. The
product was then analysed by polyacrylamide gel
electrophoresis as described above and the fluorescent
band product visualized under UV illumination. A
single fluorescent band was observed which corresponded
to the oligomer visualized by subsequent staining with
Stains-all.

The product of this reaction is an alkyl carbamate
which is stable to moderate exposure to concentrated
base. The structure of the final conjugate
synthesized by this method is:

               O
               ||
oligomer - O - C - NH - (CH2)10 - NH2

Other monoaminoalkyl and aryl derivatives can
be produced by this method. Other molecules in this
series which have been constructed include the
derivatives made with ethylene diamine and hexane diamine.
Higher chain length additions may require slight
modifications of the solvent polarity in order to
achieve the necessary concentrations. Alternatively,
lower concentrations may be used if the reaction times
are extended.

EXAMPLE 6

Synthesis of Polyethylene Glycol Derivatives of
DNAs Using Imidazole-Activated Carboxylic
Acid Esters, Polylysine Linker,
DSS and Bis-Aminoalkyl Polyethylene Glycol

In this example, a 25 nucleotide DNA complementary
to the initiation region of mouse β-globin mRNA
was synthesized according to the method given in
Example 1. After synthesis, the synthesis support was
treated with 80% acetic acid for 30 minutes to remove
trityl from the 5' end of the molecule. The solid
material was then thoroughly washed with anhydrous
acetonitrile, blown dry under a stream of dry argon and
treated with 0.3 M carbonyldiimidazole as in Example 4.
The 5' carbonyldiimidazole-activated oligomer on the
column was then washed free of excess reagent with 15
ml of acetonitrile and then treated with 0.2 M
poly-L-lysine (MW = 1000) dissolved in 50% acetonitrile
containing 0.1 M sodium phosphate, pH 8, for 16 hours at
room temperature.

The material on the column was washed free of
salts and unreacted polylysine with water and
acetonitrile and then eluted from the column with
concentrated ammonium hydroxide. After removal from the
column, the ammonium hydroxide solution containing the
oligomer conjugate was incubated 5 hours at 55°C in a
sealed glass vial. The product was then lyophilized
several times from 50% aqueous ethanol and purified via
gel filtration chromatography on TSK G4000SW in 10 mM
Tris buffer, pH 7.5. The presence of a primary amine
was determined by reaction with fluorescamine. No
fluorescence was observed with control oligomers
lacking the polyamine linker.

In order to render the polyamine conjugate
negatively charged, the complex was reacted with FITC
to label the molecule and to neutralize the positive
charges on the amines. This was accomplished by
dissolving a portion of the material in 100 µl 0.1 M sodium
bicarbonate to which was added 1 mg of FITC. After 1
hour of incubation, the unreacted FITC was removed by
gel filtration chromatography on Sephadex G-25 spun
columns (Maniatis et al. (1982), supra). The product
was then analysed by polyacrylamide gel electrophoresis
carried out as described by Maxam and Gilbert (1980),
supra, and the fluorescent band product visualized under
UV illumination. A broad fluorescent band was observed
which corresponds to the DNA visualized by Stains-all.

The oligomer containing polylysine covalently
linked to the 5' end of the molecule was then
cross-linked to bis-(aminohexyl) polyethylene glycol (MW =
3500) as follows. The polylysine oligomer is first
dialysed against 0.1 M sodium carbonate, 3 M NaCl and
concentrated to a final concentration of 4 mg/ml using
a Centricon 10 apparatus (Amicon, Danvers, N.J.). To
50 µl of this solution was added 25 µl of
disuccinimidyl suberate (DSS, 10 mg/ml in DMSO) and the mixture
incubated 10 minutes at room temperature. The
unreacted DSS was then quickly removed by gel filtration on
Sephadex G-25 and the product concentrated on Centricon 10
membranes. The solution was then made to 0.2 M in
bis-(aminohexyl) polyethylene glycol and incubated
overnight at room temperature to form the final conjugate.
Purification was effected on TSK G4000SW columns
operated as previously described.

This conjugate has the following general formula:

1. Formulation Type I

               O              O
               ||             ||
oligomer - O - C - NH - (CH - C - NH)n - CH - COOH
                         |               |
                      (CH2)4          (CH2)4
                        NHX             NHX

where X is usually H, at least one X being
-CO(CH2)6CO-HN-PEG 5000.


By varying the reaction excess or the molecular
weight of the polyethylene glycol and the polylysine
used, it is possible to construct polymer conjugates
with varying degrees of substitution, size and
charge. The ability to vary these properties of the
complex makes it possible to tailor the compound to
various applications.

EXAMPLE 7

Synthesis of Polyethylene Glycol
Derivatives of DNA Methylphosphonates

The chemical synthesis of DNA methylphosphonates
(MP) may be carried out using a modification of
the phosphochloridite method of Letsinger (Letsinger et
al., J. Amer. Chem. Soc. (1975) 97:3278; Letsinger and
Lunsford, J. Amer. Chem. Soc. (1976) 98:3605-3661;
Tanaka and Letsinger, Nucl. Acids Res. (1982)
25:3249-60). In this procedure, dried blocked nucleosides
dissolved in anhydrous acetonitrile/2,6-lutidine are
activated in situ with a stoichiometric amount of
methyl dichlorophosphine. The activated nucleoside
monochloridites are then added sequentially to the 5'
hydroxy terminal nucleotide of the growing DNA chain
bound to controlled pore glass supports via a succinate
spacer (Matteucci and Caruthers, Tetrahed. Lett. (1980)
21:719-722). Each addition is followed by capping of
unreacted 5'-hydroxyls with acetic anhydride, iodine
oxidation, and 5'-detritylation in 3% trichloroacetic
acid-methylene chloride.

The resin-bound methylphosphonate oligomers
are then dried by extensive washing in anhydrous
acetonitrile and the process repeated. Normal cycle times
using this procedure are 23 minutes with condensation
efficiencies of >32% (as judged by trityl release).
The ultimate base may be added as the cyanoethyl
phosphotriester which yields, upon cleavage in base, a
5'-terminal phosphodiester. This step makes it possible
to radiolabel the oligonucleotide and to purify and sequence
the product using gel electrophoresis at intermediate
stages of preparation (Narang et al., Can. J. Biochem.
(1975) 53:392-394; Miller et al., Nucl. Acids Res.
(1983) 11:6225-6242).

An amine-terminated linker arm is then added 
as follows. Trityl is removed as before and the resin 
treated with 0.2M Aminolink (Applied Biosystems, Foster 
City, CA) dissolved in dry acetonitrile containing 0.2M 
dimethylaminopyridine for 5 minutes. The linker arm 
oligonucleotide is then oxidized in iodine and washed 
in acetonitrile as above. Capping with acetic an- 
hydride is not performed since any deblocked primary 
amine would be modified to the base-stable acetamide 
and thus be unavailable for further reaction. 

At the end of the synthesis, the amine-terminated
linker arm methylphosphonate oligomer is base
deblocked as follows. The resin containing the DNA is
removed from the column, placed in a water-jacketed
column, and incubated in 1-2 ml phenol:ethylene diamine
(4:1) for 10 hours at 40°C. At the end of the
incubation in phenol:ethylene diamine, the resin is washed
free of the phenol reagent and the released base protecting
groups using methanol, water, methanol and methylene
chloride in succession. After drying in a stream of
nitrogen, the intact, base-deblocked chains are cleaved
from the support using EDA:ethanol (1:1) or a brief
treatment at room temperature with ammonium hydroxide.

Purification of the amine-terminated DNA
methylphosphonate is then performed as follows. The
material is first lyophilized several times from 50%
aqueous ethanol and purified via reversed phase HPLC
on C-8 silica columns eluted with 5 to 50% acetonitrile/
25 mM ammonium acetate, pH 6.8, in a linear gradient.
Amine-containing fractions, as determined by
fluorescamine reactivity, are pooled and the product recovered
by drying in vacuo and further purified by ion-exchange
HPLC on Nucleogen DEAE 60-7 eluted with 20%
acetonitrile/25 mM ammonium acetate, pH 6.5.

The purified product is then converted to the
appropriate polyethylene glycol derivative using the
heterobifunctional crosslinking agents SMCC and SATA
(succinimidyl S-acetylthioacetate). Reactions using
other reagents which can react with and modify the
nucleoside bases (e.g. sulfonyl chlorides,
glutaraldehyde or acid anhydrides) are not recommended unless
performed with the fully blocked oligonucleotide still
bound to the synthesis support.

The DNA methylphosphonate containing 5'-terminal
reactive amine linker arms is first reacted with
SATA in a 100-1000 fold molar excess at pH 8.5 (0.1 M
sodium bicarbonate). After 30 minutes at room
temperature, the excess unreacted SATA is removed by G-25
column chromatography in water, and the product is
concentrated in vacuo and stored cold until ready for further
reaction. Bis-(aminohexyl) polyethylene glycol is converted
to the maleimide derivative by treatment with a 100-1000 fold
molar excess of SMCC in 0.1 M phosphate buffer, pH 6.9,
for 1 hour at room temperature. Excess crosslinking
agent is removed by chromatography on Sephadex G-100
and the material concentrated in vacuo and stored cold
until ready for further reaction. This material is
stable for about one week when kept cold. The SATA DNA
methylphosphonate is then treated with hydroxylamine
HCl dissolved in 0.1 M phosphate buffer (pH adjusted to
7.2) for 1-2 hours. This treatment serves to release
the reactive sulfhydryl. This product is then reacted
overnight with a 10 fold molar excess of
bis-(SMCC-aminohexyl) polyethylene glycol by addition of the
latter as a powder to the solution containing the
oligomer.

Purification of the complex is then effected.
Unbound oligonucleotide is removed by gel filtration on
Sephadex G-100 or HPGFC on TSK G4000SW eluted with 10 mM
Tris, pH 7.5. The diagrammatic structure of the final
product of this procedure is:

           O                     O
           ||                    ||
oligomer - P - O - (CH2)2 - NH - C - (CH2) - S
   MP      |                                 |
           OH                               / \
                                       O = C   C = O
                                            \ /
                                             N

                                         PEG (3500)

EXAMPLE 8 

Synthesis of Polyethylene Glycol Derivatives of 
DNA Alkyltriesters Using the Phosphoramidite Approach 
The synthesis of the title compound triesters
is performed according to the method of Zon and
co-workers (Gallo et al., Nucl. Acids Res. (1986)
14:7405-20; Summers et al., Nucl. Acids Res. (1986)
14:7421-36). The method of synthesis is similar to that
used for in situ production of ethyl triesters as
described in the other examples. Fully blocked
dimethoxytrityl nucleosides are dried by repeated
lyophilization from benzene, dissolved in anhydrous
acetonitrile/2,6-lutidine and added dropwise to a stirred
solution of chlorodiisopropylaminoethoxyphosphine in
the same solvent at -70°C. The product is recovered by
aqueous extraction, drying in vacuo and silica gel
chromatography.

The chemical synthesis of DNA ethyl triesters
(ETE) can be carried out using slight modifications of
the conventional phosphoramidite methods. In this
technique, nucleoside phosphoramidites dissolved in
anhydrous acetonitrile are mixed with tetrazole and
sequentially coupled to the 5'-hydroxy terminal
nucleoside bound to CPG. Nucleoside addition is followed by
capping of unreacted 5'-hydroxyls with acetic
anhydride, iodine oxidation, and 5'-detritylation in
trichloroacetic acid-methylene chloride. The resin-bound
oligomer is then dried by extensive washing in
anhydrous acetonitrile and the process repeated. Normal
cycle times using this procedure are 17 minutes with
condensation efficiencies of >96% (as judged by trityl
release). The terminal residue is conventionally added
as a diester in order to facilitate radiolabeling and
purification. The 5'-terminal trityl group is left on if
HPLC purification is desired, but generally the
5'-terminal trityl is removed and the Aminolink procedure
described in Example 1 is used.
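
The relationship between the per-cycle condensation efficiency and the amount of full-length product can be illustrated with a short Python sketch. It assumes, as a simplification, that every coupling proceeds with the same efficiency and that a chain of n nucleotides requires n - 1 couplings because the first nucleoside is already bound to the support. The 20 nucleotide length, the 0.99 comparison value and the function name are illustrative assumptions; 0.96 is the lower bound on efficiency quoted above.

```python
def full_length_fraction(n_bases, coupling_efficiency):
    """Fraction of support-bound chains that reach full length.

    Assumes a constant per-cycle efficiency and n_bases - 1 couplings,
    since the first nucleoside is pre-attached to the support.
    """
    if n_bases < 1:
        raise ValueError("n_bases must be at least 1")
    couplings = n_bases - 1
    return coupling_efficiency ** couplings


# Compare the quoted lower bound (0.96) with a hypothetical 0.99 efficiency.
for efficiency in (0.96, 0.99):
    fraction = full_length_fraction(20, efficiency)
    print(f"efficiency {efficiency:.2f}: {fraction:.1%} full-length 20-mer")
```

Because the losses compound over every cycle, small differences in condensation efficiency translate into large differences in the crude yield of full-length oligomer, which is one reason the purification steps that follow are needed.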

At the end of the synthesis, the fully blocked
product is base-deblocked as follows. The resin
containing the fully protected DNA is removed from the
column and placed in a water-jacketed chromatography
column. The resin is then incubated in 1-2 ml
phenol:ethylene diamine (4:1) for 10 hours at 40°C. At the
end of the incubation in phenol:ethylene diamine, the
resin is washed free of the phenol reagent and the released
base protecting groups using methanol, water,
methanol and methylene chloride in succession. After
drying in a stream of nitrogen, the intact,
base-deblocked chains are cleaved from the support using
EDA:ethanol (1:1) or a brief treatment at room
temperature with ammonium hydroxide.

Purification of the Aminolink DNA ethyl
triester product is then performed as follows. The
material is first lyophilized several times from 50%
aqueous ethanol and purified via reversed phase HPLC
on C-8 silica columns eluted with 5 to 50% acetonitrile/
25 mM sodium acetate, pH 6.8, in a linear gradient.
Amine-containing fractions, as determined by
fluorescamine reactivity, are pooled and the product recovered
by drying in vacuo and further purified by ion-exchange
HPLC on Nucleogen DEAE 60-7 eluted with 25% acetonitrile/25
mM ammonium acetate, pH 6.5.

The product oligonucleotide is then suitable
for coupling to polyethylene glycol by any of the
techniques previously described. In our experiments
several techniques have been used, including SMCC, SPDP,
carbonyldiimidazole, disuccinimidyl suberimidate and
phenyl isocyanate.

The SMCC/SPDP coupling reaction is as follows.
The linker arm probe is coupled to excess SPDP followed
by reduction with dithiothreitol (DTT), the unreacted
DTT removed and the product allowed to cross-link
through the free sulfhydryl to SMCC previously coupled
to bis-(aminohexyl) polyethylene glycol (PEG). The
formation of the thioether linkage is rapid and
selective and the linkage formed is quite stable to a
variety of conditions. The precise method of linkage
formation is as follows:

WO 88/09810                                PCT/US88/02009

The DNA containing amine linker arms is reacted
with SPDP in a 100-1000 fold molar excess at pH
8.5 (0.1 M sodium bicarbonate). After 1 hour at room
temperature, the excess unreacted reagent is removed by
G-25 column chromatography and the probe SPDP conjugate
concentrated in vacuo. Bis-(aminohexyl) polyethylene
glycol is converted to the maleimide derivative as
described in the previous example. The SPDP DNA triester
is then treated with 10 mM mercaptoethanol dissolved in
0.1 M phosphate buffer (pH adjusted to 7.2) for 1 hour.
This treatment serves to release the 5' thiopyridone,
thus forming a reactive sulfhydryl. Excess reducing
agent is then removed using a G-25 spun column operated
as previously described, with the exception that all
separations are performed in degassed 0.1 M phosphate
buffer, pH 6.8, under a nitrogen atmosphere to prevent
the reoxidation of the terminal SH. In this procedure
it is essential that all excess reducing agent be
removed in order to prevent its subsequent reaction with
the maleimidylated polyethylene glycol.

Thiopyridone released in this procedure
provides a convenient indirect method for quantitating the
presence of the 5'-terminal SH. Thiopyridone released
by reduction has a UV absorption at 343 nm. By
following the increase in absorbance of the solution at this
wavelength, the course of the reduction is easily
followed. The thiopyridone can then be quantitated using
a molar extinction coefficient of 8080. The product is
then reacted with a 10 fold molar excess of
bis-(SMCC-aminohexyl) polyethylene glycol by addition
of the latter as a powder or a concentrated solution to
the solution containing the SH terminated oligomer
triester. The reaction is allowed to proceed overnight at
25°C.
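
The quantitation described above follows the Beer-Lambert relation, and a minimal Python sketch of the calculation is given here for illustration. The extinction coefficient of 8080 is taken from the text and is assumed to be expressed in the customary units of M^-1 cm^-1; the 1 cm path length, the example absorbance reading and the function name are hypothetical.

```python
def released_thiol_uM(delta_a343, path_cm=1.0, epsilon=8080.0):
    """Concentration (micromolar) of thiopyridone released on reduction.

    delta_a343 : increase in absorbance at 343 nm after adding reducing agent
    path_cm    : optical path length in cm (assumed 1 cm cuvette)
    epsilon    : molar extinction coefficient, assumed to be in M^-1 cm^-1
    """
    molar = delta_a343 / (epsilon * path_cm)  # Beer-Lambert: c = A / (e * l)
    return molar * 1e6                        # mol/l converted to umol/l


# Hypothetical reading: absorbance rises by 0.404 on reduction.
print(f"{released_thiol_uM(0.404):.1f} uM thiopyridone released")
```

Since one thiopyridone is released for each sulfhydryl unmasked, the same figure serves as an estimate of the concentration of SH terminated oligomer available for coupling.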

Purification of the complex is then effected.
Unbound oligonucleotide is removed by gel filtration on
Sephadex G-100 or HPGFC on TSK G4000SW eluted with 10 mM
Tris, pH 7.5. The diagrammatic structure of the final
product of this procedure is:

EXAMPLE 9

Synthesis of Polyether Derivatives of DNA Alkyl and
Aryltriesters Using the Phosphate Triester Approach

Synthesis of Phosphotriester Oligonucleotides
of Varying Alkyl and Aryl Substituent Type.

The best available method for the production
of the various triesters of variable alkane chain
length is via conventional phosphate triester chemistry
to synthesize the desired sequences as the
p-chlorophenyl phosphate triesters (PTE). Upon completion of
the synthesis, the fully protected oligonucleotide
chlorophenyltriesters, bound to the synthesis support,
are subjected to ester exchange in the presence of
tetrabutylammonium fluoride and the desired alcohol.
This basic method for the construction of DNA
oligonucleotides is classical DNA synthesis chemistry. See
Gait (1984) Oligonucleotide Synthesis: A Practical
Approach, IRL Press, Washington, D.C.

The chemical synthesis of DNA p- or
o-chlorophenyl phosphotriesters was carried out using a
modification of the phosphochloridite method of Letsinger
(Tanaka and Letsinger, Nucl. Acids Res. (1982)
25:3249-60). For automated DNA synthesis, see Alvarado-Urbina
et al., Science (1981) 214:270-273.

Fully blocked and dried nucleosides dissolved
in anhydrous acetonitrile/2,6-lutidine and activated in
situ with chlorophenoxydichlorophosphine are
sequentially added to the 5'-hydroxy terminal nucleotide of
the growing DNA chain bound to controlled pore glass
supports via a succinate spacer as in previous
examples. Derivatized glass supports, fully blocked
nucleosides and other synthesis reagents are
commercially available through Applied Biosystems (San
Francisco, CA) or American Bionuclear (Emeryville, CA).
Nucleoside addition is followed by capping of unreacted
5'-hydroxyls with acetic anhydride, iodine oxidation,
and 5'-detritylation in trichloroacetic acid-methylene
chloride.

The resin-bound oligomer chlorophenyltriester
is then dried by extensive washing in anhydrous
acetonitrile and the process repeated. Normal cycle times
using this procedure are 13 minutes with condensation
efficiencies of >92% (as judged by trityl release).
The ultimate base may be added as a β-cyanoethyl
phosphotriester which yields, upon cleavage in base, a
5'-terminal phosphodiester. This step makes it possible
to radiolabel the oligonucleotide and to purify and
sequence the product using gel electrophoresis (Narang et
al., Can. J. Biochem. (1975) 53:392-4; Miller et al.,
Biochemistry (1986) 25:5092-97).

The fully blocked material bound to the syn- 
thesis support is then subject to ester exchange in the 
presence of tetrabutylammonium fluoride (TBAF) and the 
desired alcohol under anhydrous conditions. This 
method yields rapid and quantitative alcohol exchange. 
The reaction is complete within 20 minutes for most 
aryl and alkyl alcohols which are capable of forming 
stable products. 

In this example, anhydrous n-propanol is used
to dissolve TBAF to a final concentration of 0.2 M. The
solution is then percolated slowly over the resin
containing the oligomer chlorophenyl triester and allowed
to react for about 1 hour at room temperature. The
resin is then washed with methanol and acetonitrile and
dried under a stream of dry argon. Amine linker arm
addition, deblocking and purification are then effected
as in Example 8. Polyethylene glycol conjugation is
performed as in Example 7. The final yield of
conjugate is about 10% of the starting equivalents of
nucleoside resin used. The diagrammatic structure of the
final product is:



           O                     O
           ||                    ||
oligomer - P - O - (CH2)2 - NH - C - (CH2)2 - S
   PTE     |                                  |
           OH                                / \
                                        O = C   C = O
                                             \ /
                                              N

                                          PEG (3500)



EXAMPLE 10

The Effect of Trityl Terminated
Oligonucleotides on the Synthesis of β-globin
Protein in vitro and in Cultured Cells

Using the methods of synthesis provided in the
previous examples, both normal and ethyl triester type
oligonucleotides were constructed. In the simplest
example of an amphiphilic DNA conjugate containing a
hydrophobic grouping at the 5' end of the molecule, the
trityl group is left on at the end of the synthesis.
Purified materials of this type were tested for their
effectiveness in preventing the specific expression of
hemoglobin in mouse erythroleukemia cells induced to
produce hemoglobin. The oligonucleotides tested in
these and the following examples are given in Table I.

The cells chosen for these experiments are
Friend murine erythroleukemia (MEL) cells, which can be
induced to synthesize hemoglobin by a variety of agents
including DMSO and butyric acid (cf. Gusella and
Houseman, Cell (1976) 8:263-269). MEL cells are grown
in culture using conventional techniques in a CO2
incubator.

Induced cells which are expressing globin can
be visualized by benzidine treatment, which stains
hemoglobin-producing cells blue (Leder et al., Science
(1975) 190:893). Cells were exposed to the selected
oligonucleotide conjugates at concentrations ranging
from 1 mg/ml to 1 µg/ml during induction. Controls
included mock-treated cells and cells treated with
random sequence oligomer controls. Treated cells were
scored at various time intervals for globin production
based on staining intensity and the results compared to
controls. About 50% of the control cells are
inducible. Cell death or damage due to treatment is
scored by Trypan blue exclusion in order to obtain an
indication of toxicity and cell damage.

The results obtained are presented in Table
II. These results show that the trityl terminated
oligomers are more effective in producing the desired
degree of synthesis inhibition. The trityl modified
oligomers, however, showed some degree of cell damage,
which argues against their general use as
therapeutic agents.

EXAMPLE 11

The Effect of Long Chain Alkyl Terminated
Oligonucleotides on the Synthesis of
β-globin Protein in Cultured Cells

Using the method of synthesis provided in the
previous examples, 15 to 20 base long oligonucleotides
conjugated to a 5'-terminal aminoalkane were
constructed as described in Example 5. Purified materials of
this type were tested for their effectiveness in
preventing the specific expression of hemoglobin in MEL
cells induced to produce hemoglobin. The results are
given in Table III. The protocol for the test is given
in Example 10.

TABLE III

THE EFFECT OF INCREASING HYDROPHOBICITY ON THE
EFFECTIVENESS OF OLIGONUCLEOTIDES IN PREVENTING
HEMOGLOBIN SYNTHESIS IN CULTURED CELLS

Treatment*          Concentration   Viable Cells   Inhibition of
                                                   Benzidine Cells

DMSO Control                            46%             0%
MBG-20 Antisense        50 µM           50%            41%
MBG-20-C2               50 µM           61%            41%
MBG-20-C6               50 µM           60%            48%
MBG-20-C10              50 µM           62%            66%

*See Table I.


As shown in Table III, the results obtained
indicate that the aminoalkane-terminated oligomers are
more effective in producing the desired degree of
selective synthesis inhibition than their cognate
sequences lacking the terminal alkane. For example, the
C10 derivative was about 60% more effective than the
control unmodified 20 mer in reducing the number of
hemoglobin positive cells. In general, the longer the
alkyl chain, the lower the concentration of oligomer
required to effect the same % of inhibition.
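
The "about 60% more effective" figure can be checked against the inhibition values reported in Table III. The Python sketch below is offered only as an illustration of that arithmetic; the dictionary layout and the function name are assumptions, while the percentages are those listed in the table.

```python
# Percent inhibition of benzidine-positive cells, as listed in Table III.
INHIBITION = {
    "MBG-20": 41.0,
    "MBG-20-C2": 41.0,
    "MBG-20-C6": 48.0,
    "MBG-20-C10": 66.0,
}


def relative_gain(conjugate, control="MBG-20"):
    """Percent improvement in inhibition of a conjugate over the control."""
    base = INHIBITION[control]
    return (INHIBITION[conjugate] - base) / base * 100.0


for name in ("MBG-20-C2", "MBG-20-C6", "MBG-20-C10"):
    print(f"{name}: {relative_gain(name):.0f}% more effective than MBG-20")
```

On these figures the C10 conjugate shows a relative gain of roughly 61%, consistent with the statement above, while the C2 conjugate shows no gain over the unmodified 20 mer.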

EXAMPLE 12

The Effect of Fluorescein Terminated
Oligonucleotides on the Synthesis of
β-globin Protein in Cultured Cells

Using the methods of synthesis provided in
Example 1, 15 to 20 base long oligonucleotides conjugated
to a 5'-terminal fluorescein using ethylene diamine as
the linker were constructed. This material has the
further advantage that uptake of the oligomer into the
cells can be monitored by fluorescence microscopy, which
provides further evidence of the cellular fate of the
product. Purified fluorescent oligomers were tested
for their effectiveness in preventing the specific
expression of hemoglobin in MEL cells induced to produce
hemoglobin. The results are shown in Table IV. The
protocol for the test is given in Example 10.

TABLE IV

THE EFFECT OF FITC CONJUGATION ON THE INHIBITION OF
HEMOGLOBIN SYNTHESIS IN CULTURED CELLS

Oligomer*           Concentration   Viable Cells   % Inhibition

DMSO Control                            53%             0%
MBG-20 Antisense        50 µM           73%            35%
MBG-20-C2-FITC          50 µM           68%            45%
MBG-20-C6-FITC          50 µM           76%            36%
MBG-20-C10-FITC         50 µM           72%            52%

*See Table I.


As shown in Table IV, the results obtained
indicate that the fluorescein-terminated oligomers are at
least as effective in producing selective inhibition of
hemoglobin synthesis as their cognate control sequences
lacking the FITC. Further, fluorescence microscopy of
the treated cells showed enhanced fluorescence due to
fluoresceinated oligomer uptake. These cells were then
isolated, washed several times in physiological saline
and lysed by freeze thawing several times in water.
The resultant solution was centrifuged to remove cell
debris and the amount of fluoresceinated oligomer
present quantitated in an Aminco spectrofluorometer. The
results obtained showed that the treated cells
assimilated an average of 10^7 molecules of fluoresceinated
oligomer per cell. This is about 10 times higher than the
cellular uptake of similar DNA oligomers lacking
the solubility modifying moiety, which is about 10^6
molecules per cell.

Thus it can be seen that the addition of a
hydrophobic moiety, in this case fluorescein, to the
oligomer results in substantially increased cellular
uptake of the oligomer without affecting its ability to
selectively block protein synthesis.

EXAMPLE 13

The Effect of Polyethylene Glycol Terminated
Oligonucleotides on the Synthesis of β-globin
Protein in Cultured Cells

Using the methods of synthesis provided in the
previous examples, 20 base long oligonucleotides
conjugated to a 5'-terminal polyethylene glycol were
constructed as described in Example 4. These molecular
conjugates were purified and tested for their
effectiveness in preventing the specific expression of
hemoglobin as described in Example 10.



TABLE V

THE EFFECT OF POLYETHYLENE GLYCOL CONJUGATION ON THE
INHIBITION OF HEMOGLOBIN SYNTHESIS IN CULTURED CELLS

Oligomer*           Concentration   Viable Cells   Inhibition

DMSO Control                            33%             0%
MBG-15 Antisense       100 µM           50%            25%
MBG-15-C2              100 µM           60%            22%
PEG(ss)                100 µM           43%            24%
MBG-20 + PEG(ss)       100 µM           43%            78%

DMSO Control                            65%             0%
MBG-20-PEG(ss)          15 µM            0%            95%
                         5 µM           62%            52%
                         1 µM           nd             -2%
                       0.1 µM           64%            -5%

*See Table I.



As shown in Table V, the results obtained show
that oligomers conjugated to polyethylene glycol are
more effective in producing the desired degree of
selective synthesis inhibition than controls. The
polyethylene glycol conjugate in this experiment was
found to be approximately 10 times more active in preventing
the expression of hemoglobin than the control
combination of the 20 mer and polyethylene glycol. It
is also interesting to note that the simple addition of
polyethylene glycol to the medium increases the effectiveness
of the added control antisense oligomer, in
consonance with the increased effectiveness observed
for the PEG conjugates.
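
One way to see where a figure of roughly tenfold could come from is to interpolate the conjugate's values in Table V on a log-concentration scale and ask what concentration would match the 78% inhibition obtained with the unconjugated mixture at 100. The sketch below is only an illustration: the log-linear interpolation, the function name, and the choice of 78% as the matching point are assumptions, not a method stated in the specification. Concentrations are in the units printed in Table V.

```python
import math

# Inhibition values for the MBG-20-PEG(ss) conjugate, copied from Table V:
# (concentration, percent inhibition). Units are those printed in the table.
CONJUGATE = [(0.1, -5.0), (1.0, -2.0), (5.0, 52.0), (15.0, 95.0)]

# Unconjugated mixture MBG-20 + PEG(ss): 78% inhibition at a concentration of 100.
MIXTURE_CONC, MIXTURE_INHIBITION = 100.0, 78.0


def concentration_for(target, points):
    """Interpolate linearly in log10(concentration) between bracketing points.

    This interpolation scheme is an assumption made for illustration.
    """
    for (c_lo, y_lo), (c_hi, y_hi) in zip(points, points[1:]):
        if y_lo <= target <= y_hi:
            frac = (target - y_lo) / (y_hi - y_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    raise ValueError("target inhibition is outside the measured range")


matched = concentration_for(MIXTURE_INHIBITION, CONJUGATE)
fold = MIXTURE_CONC / matched
print(f"conjugate concentration matching 78% inhibition: {matched:.1f}")
print(f"apparent fold improvement over the mixture: {fold:.1f}")
```

With only four measured points the estimate is coarse, so it should be read as a consistency check on the stated order of magnitude rather than as a fitted potency ratio.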

It is evident from the above results that the
novel conjugates of the subject invention provide substantial
advantages in enhancing the efficiency with
which transcriptional mechanisms may be modulated. In
accordance with the subject invention, a wide variety
of cellular, both prokaryotic and eukaryotic, as well
as viral, physiological processes may be regulated.
The compositions can be used in vitro and in vivo. In
the former, systems can be studied, mammalian cells
protected from mycoplasma, phenotypes modified, and the
like. In the latter, the compositions can be used for
therapy in inhibiting the proliferation of pathogens,
selectively inhibiting certain classes of cells, e.g.,
B-cells and T-cells, or the like.

All publications and patent applications mentioned
in this specification are indicative of the
level of skill of those skilled in the art to which
this invention pertains. All publications and patent
applications are herein incorporated by reference to
the same extent as if each individual publication or
patent application was specifically and individually
indicated to be incorporated by reference.

Although the foregoing invention has been described
in some detail by way of illustration and example
for purposes of clarity of understanding, it will
be obvious that certain changes and modifications may
be practiced within the scope of the appended claims.

WHAT IS CLAIMED IS:



1. A method for inhibiting the maturation or
translation of a messenger RNA in a cell, said method
comprising:

contacting said cell with a composition comprising
an oligonucleotide sequence complementary to a
transcription product of said cell and a group covalently
linked to said oligonucleotide sequence to
provide an amphiphilic molecule, whereby said
composition migrates into the cell interior resulting
in the inhibition of maturation and/or translation of
said transcription product.

2. A method according to Claim 1, wherein
said cell is in culture and said composition is
introduced into the nutrient medium.

3. A method according to Claim 1, wherein 
said oligonucleotide is of from about 6 to 30 
nucleotides. 



4. A method according to Claim 3, wherein at
least one of said oligonucleotides has a phosphate as
the phosphorus moiety.

5. A method according to Claim 3, wherein at
least one of said oligonucleotides has a phosphonate
with an alkyl group of from 1 to 3 carbon atoms as the
phosphorus moiety.

6. A method according to Claim 1, wherein
said group is a hydrophobic aromatic group.



7. A method according to Claim 6, wherein
said aromatic group is a trityl group.
8. A method according to Claim 7, wherein
said aromatic group is a fluorescein group.

9. A method according to Claim 1, wherein
said group is a polyalkyleneoxy group, wherein said
alkylenes are of from 2 to 10 carbon atoms.

10. A method according to Claim 9, wherein
said polyalkyleneoxy group is from about 6 to 200
units.

11. A cell comprising a composition comprising
an oligonucleotide sequence complementary to a
transcription product of said cell and an amphiphilic
or hydrophobic group covalently linked to said
oligonucleotide sequence to provide an amphiphilic
molecule.

12. A cell according to Claim 11, wherein
said cell is in culture.

13. A composition of matter comprising:

an oligonucleotide sequence of at least six
nucleotides complementary to a transcriptional product
of a cell;

an amphiphilic group comprising a polyalkyleneoxy
group, wherein said alkylenes are of from 2 to
10 carbon atoms;

a linker of at least one atom covalently
bonded to said oligonucleotide sequence and to said
amphiphilic group.

14. A composition of matter according to
Claim 13, wherein said oligonucleotide is of from about
6 to 30 nucleotides.

15. A composition of matter according to
Claim 13, wherein at least one of said oligonucleotides
has a phosphate as the phosphorus moiety.

16. A composition of matter according to 
Claim 13, wherein at least one of said oligonucleotides
has a phosphonate with an alkyl group of from 1 to 3 
carbon atoms as the phosphorus moiety. 

17. A composition of matter according to
Claim 13, wherein said linking group includes at least 
one of an amino, quinone, thioether, or amide group. 

18. A composition of matter according to 
Claim 13, wherein said oligonucleotide sequence is 
complementary at least in part to a non-coding region. 

19. A composition of matter according to 
Claim 13, wherein said oligonucleotide sequence is 
complementary at least in part to a coding region. 
Identification of the Cystic Fibrosis Gene:
Cloning and Characterization of 
Complementary DNA 



John R. Riordan, Johanna M. Rommens, Bat-sheva Kerem, Noa Alon, 
Richard Rozmahel, Zbyszko Grzelczak, Julian Zielenski, Si Lok,
Natasa Plavsic, Jia-Ling Chou, Mitchell L. Drumm, Michael C. Iannuzzi, 
Francis S. Collins, Lap-Chee Tsui 



Overlapping complementary DNA clones were isolated 
from epithelial cell libraries with a genomic DNA seg- 
ment containing a portion of the putative cystic fibrosis 
(CF) locus, which is on chromosome 7. Transcripts, 
approximately 6500 nucleotides in size, were detectable 
in the tissues affected in patients with CF. The predicted 
protein consists of two similar motifs, each with (i) a 
domain having properties consistent with membrane as- 
sociation and (ii) a domain believed to be involved in ATP 
(adenosine triphosphate) binding. A deletion of three 
base pairs that results in the omission of a phenylalanine 
residue at the center of the first predicted nucleotide- 
binding domain was detected in CF patients. 



CYSTIC FIBROSIS (CF) IS AN AUTOSOMAL RECESSIVE GENET- 
IC disorder affecting a number of organs, including the lung 
airways, pancreas, and sweat glands (1). Abnormally high 
electrical potential differences have been detected across the epithelial
surfaces of the CF respiratory tract, including the trachea and
nasal polyps, as well as across the walls of CF sweat gland secretory
coils and reabsorptive ducts (2). The basic defect has been associated
with decreased chloride ion conductance across the apical membrane
of the epithelial cells (3). That the defect also appeared to persist in
cultured cells derived from several epithelial tissues suggested that
the CF gene is expressed in these cells (4). More recently, patch
clamp studies showed that this defect is probably due to a failure of
an outwardly rectifying anion channel to respond to phosphorylation
by cyclic AMP-dependent protein kinase (protein kinase A) or
protein kinase C (5). Although progress has been made in the
isolation of polypeptide components of an epithelial chloride channel
that mediates conductance (6), their relation to the kinase-
activated pathway and CF has yet to be established, and the basic 
biochemical defect in CF remains unknown. 

Molecular cloning experiments have permitted the isolation of a 
large, contiguous segment of DNA spanning at least four tran- 
scribed sequences from a region thought to contain the CF locus 
(7). These sequences were initially identified on the basis of their 
ability to detect conserved sequences in other animal species by 
DNA hybridization and were subsequently characterized by RNA 
hybridization experiments, cDNA isolation, and direct DNA sequence
analysis (7). Three of the transcribed regions were excluded
from being the CF locus by earlier genetic or DNA sequence
analyses (7, 8). The fourth one, as shown by genetic analysis (9) and
DNA sequencing analysis presented below, corresponds to a portion
of the CF gene locus.

Isolation of cDNA clones. Two DNA segments (E4.3 and
H1.6) that detected cross-species hybridization signals (7) were used
as probes to screen cDNA libraries made from several tissues and cell
types (10). After screening seven different libraries, one single clone
(10-1) was isolated with H1.6 from a cDNA library made from the
cultured epithelial cells of the sweat glands of an unaffected (non-
CF) individual (10).

DNA sequencing showed that 10-1 contained an insert of 920 
base pairs (bp) in size and one potential, long open reading frame 
(ORF). Since one end of the sequence shared perfect sequence 
identity with H1.6, it was concluded that the cDNA clone was
probably derived from this region. The DNA sequence in common
was, however, only 113 bp long (Figs. 1 and 2). This sequence in
fact corresponded to the first exon of the putative CF gene. The
short sequence overlap thus explained the weak hybridization signals 
in library screening and our inability to detect transcripts in RNA 
gel-blot analysis. In addition, the orientation of the transcription 
unit was tentatively established on the basis of alignment of the 
genomic DNA sequence with the presumptive ORF of 10-1. 

Since the corresponding transcript was estimated to be about 
6500 nucleotides in length by RNA gel-blot hybridization experiments,
further cDNA library screening was required in order to
clone the remainder of the coding region. As a result of several
successive screenings with cDNA libraries generated from the colon
carcinoma cell line T84, normal and CF sweat gland cells, pancreas,
and adult lungs, 18 additional clones were isolated (Fig. 1). DNA
sequence analysis revealed that none of these cDNA clones corre- 
sponded to the length of the observed transcript, but it was possible 
to derive a consensus sequence based on overlapping regions. 
Further cDNA clones corresponding to the 5' and 3' ends of the 
transcript were derived from 5' and 3' primer-extension experiments 
(Fig. 1). Together, these clones span about 6.1 kb and contain an 
ORF capable of encoding a protein of 1480 amino acids (Fig. 2). 

It was unusual that most of the cDNA clones isolated here 
contained sequence insertions at various locations (Fig. 1). While 
many of these extra sequences corresponded to intron regions 
reverse-transcribed during the construction of the cDNA, as re- 
vealed on alignment with genomic DNA sequences, the identities of 
several others were uncertain because they did not align with 
sequences at the corresponding exon-intron junctions, namely, the 
sequences at the 5' ends of clones 13a and T16-1 and at the 5' and 
3' ends of T11, and the insertions between exons 3 and 4 in 13a and
between exons 10 and 11 in T16-4.5 (legend to Fig. 1). More
puzzling were the sequences corresponding to the reverse complement
of exon 6 at the 5' end of 11a and the insertion of a segment of
a bacterial transposon in clone C16-1; none of these could be 
explained by mRNA processing errors. 

In that the number of recombinant cDNA clones for the putative
CF gene detected in the library screening was much less than would
have been expected from the abundance of transcripts estimated
from RNA hybridization experiments, it seemed probable that the
clones that contained aberrant structures were preferentially retained
while the proper clones were lost during propagation. Consistent 
with this interpretation, poor growth was observed for most of our 
recombinant clones isolated, regardless of the vector used. 

RNA analysis. To visualize the transcript of the putative CF 
gene, we used RNA gel-blot hybridization with the 10-1 cDNA as
the probe (Fig. 3). The analysis revealed a prominent band, about
6.5 kb in size, in T84 cells. Identical results were obtained with 
other cDNA clones as probes. Similar, strong hybridization signals 
were also detected in pancreas and primary cultures of cells from 
nasal polyps, suggesting that the mature mRNA of the putative CF 
gene is about 6.5 kb. Minor hybridization signals, probably repre- 
senting degradation products, were detected at the lower size 
ranges, but they varied between different experiments. On the basis 
of the hybridization band intensity and comparison with those
detected for other transcripts under identical experimental condi- 
tions, it was estimated that the putative CF gene transcripts 
constituted about 0.01 percent of total mRNA in T84 cells. 

Additional tissues were analyzed by RNA gel-blot hybridization
in an attempt to correlate the expression pattern of the putative CF 
gene and the pathology of CF. Transcripts, all of identical size, were 
found in lung, colon, sweat glands (cultured epithelial cells), 
placenta, liver, and parotid gland, but the signal in these tissues was 
generally weaker than that detected in the pancreas and nasal polyps 
(Fig. 3). Intensity varied among different preparations; for example, 
hybridization in kidney was not detectable in the preparation shown 
in Fig. 3 but was clearly discernible subsequently. Transcripts were 
not detected in the brain or adrenal gland, nor in skin fibroblast and 
lymphoblast cell lines. 

Thus, expression of the putative CF gene appeared to occur in 
many of the tissues examined, with higher levels in those tissues 
severely affected in CF. While this epithelial tissue-specific expres- 
sion pattern is in good agreement with the disease pathology, no 
significant difference was detected in the amount or size of transcripts
from CF and control tissues (Fig. 3), consistent with the
assumption that CF mutations are subtle changes at the nucleotide 
level. 

Characterization of cDNA clones. As indicated above, a contig- 
Fig. 1. Overlapping cDNA clones aligned with genomic DNA fragments.
The cDNA clones are represented by open boxes with exons indicated.
The corresponding genomic Eco RI fragments are schematically
presented on the bottom, with lengths in kilobases. The hatched boxes
denote intron sequences, and stippled boxes represent other sequences
as outlined below. The filled box in the lower left is the position of
the clone H1.6, which was used to isolate the first cDNA clone 10-1
from a normal (N) sweat gland library (10). The definitive restriction
sites used for the alignment of cDNA and genomic fragments are
indicated. Clones T6, T6/20, T11, T16-1, T13-1, T16-4.5, T8-B3, and
T12a were isolated sequentially from the T84 cell library (10). Clones
isolated from the human lung cDNA library (10) are designated with the
prefix CDL. CDPJ5 is derived from a pancreas library (10). The CF
sweat gland cDNA clones, C16-1 and C1-1/5, together cover all but
exon 1 and a portion of the 3' untranslated region. Both clones
revealed a 3-bp deletion in exon 10. Clones that contain intron
sequences are CDLS26-1, T6/20, and T13-1. Clones T11, T16-4.5,
CDLS16A, 11a, and 13a contain extraneous sequences of unknown origin
at positions indicated. Clone C16-1 contains a short insertion
corresponding to a portion of the γδ transposon of E. coli. Both PA3-5
and TB2-7 are 5' extension clones generated from pancreas and T84
RNA, respectively, by the anchored PCR technique (12). THZ-4 is a 3'
extension clone obtained from T84 RNA. Both T12a and THZ-4 contain a
polyadenylation signal and a poly(A)+ tail.
K S L CE



V T P t • A R v m 876 



AAGACCTTAATTTTT<*rGCTAATTTCGTGCTTAGTAATTT7TCTGCCAGAGGTGGCTGCT 



V V 1 LWLLGFItP LQDKGN ST 896 



TCTTTGGTTGTGCTGTCGCTCCTTGGAAAfcACTCCTCTTCAAGACAAAGGGAATAGTACT 



HSRNNSYAV1 1TSTS Is V y V~T1 »i 6 
CATAGTAGAAATAACAGCTATGCAGTGATTATCACCAGCACCAGTTCGTATTATGTGTTT 

1 Y I Y V G V A D T L L A - H K F ~F*1 R G L P 936 
TACATTTACGTGGGAGTAGCCGACACTrTGCTTGCTATGGCATTCTTCAGAGGTCTACCA 

LVHTL1 TVSK ILUHXHLHSV 956 
CTGGTGCATACTCTAATCACAGTGTCGAAAATTTTACACCACAAAATGTTACATTCTGTT 

LCAPMSTLHTLXAfcCILNRF 976 
CTTCAAGCACCTATGTCAACCCTCAACACGTTGAAAGCAGbTGGGATTCTTAATAGATTC 



S K D I A I LDDLLPL T [ Z F D F I ~Ol 996 
TCCAAAGATATAGCAATTTTGGATGACCTTCTGCCTCTTACCATA?TTGACTTCATCCAd 



I A V V A V lj Q P i Y I f! i016 



TTGT7ATTAATTGTGATTGGAGCTATAGCAGTTGTCGCAGTTTTACJIACCCTACATCTTT 

IVATVPVTVAFIHI. RAYFT.I Q 7 1036 
GTTGCAACAGTGCCAGTGATAGTGGCTTTTATTATGTTGAGAGCATATTTCCTCCAAACC 

SOQLX0LCSEGRSP! FTHLV 1056 
TCACAGCAACTCAAACAACTGGAATCTGAAGGCAGGAGTCCAATTTTCACTCATCTTGTT 
• • 

TS LKGLWTLRAFGROP YFET 1076 
ACAAGCTTAAAAGGACTATGGACACTTCGTGCCTTCGGACGGCAGCCTTACTTTGAAACT 

LFHKALWLHTANWFLYLSTL 1096 
CTGTTCCACAAAGCTCTGAATTTACATACTGCCAACTCGTTCTTGTACCTGTCAACACTG 



F O M R 1 I E H 1 " 



CGCTGGTTCCAAATGAGAAT AGAAAT GA' 



I F F I A V T Fl 



A<£;AC 



: Ll 1156 

;crrfc • 



T-GTCATCTTCTTCATTGCTGTTACCTTC 

rtVGl 1 LTLAl U36 

ATTTCCATTTTAACAACAGfcAGAAGGAGAAGGAAGAGTTGGTATTATCCTGACTTTAGCC 

iMNIMSTLOWAVMRrf I D V D _ 
ATGAATATCATGAGTACATTGCAGTGGGCTGriAAACTCCAGCArAGATGTGGATAG! 

HRSVSRVF KF I DMPTEGKPT 1176 

atgcgatctgtgagccgagtctttaagttcattgacatgccaacagaaggtaaacctacc 

• 

XSTXPYXWGOLSXVHI1ENS 1196 
AAGTCAACCAAACCATACAAGAATGGCCAACTCTCGAAAGTTArGATTATTGAGAATTCA 

HVKXDDIttPSGGQMTVKDLT 1216 
CACGTGAAGAMCATGACATCTGGCCCTCAG«K^CCAAATGACTGTCAAAGATCTCACA 



A X T T a Q Q M A 



3 F 3 I fl 



3781 GCAAAATACACAGAAGGTGGAAATGCCATATTAGAGAACATTTCCTTCTCAATAAGTCCT 



3841 
3901 
3961 
4021 
4081 
4141 
4201 
4261 
4321 
4381 
;tggg cct ctt gggaag AACTGGATCAGGGAAGAGTACT ttgtt a tc agct

rL RLLWTKGK I QID0V8WDS 1276 

Tl i jigagactactgaacactgaagga^»aaatccagat<x^t<xjtgtgtcttgggattca 

ITLOQURKAFC VIP 0 I K V T I 9 1296 
ATAACTTTCCAACAGTCGACGAAAGCCTTTGCAGTX»TACCACAG 

SGTFRKWLDF T EOWSDQEIg 1316 
TCTGGAACA1TTAGAAAAAACTTGCATCCCTATCAACAGTGGAGTCATCAAGAAA 

K ? A D K 1 V CLRSV IKOFPCKLD 1336 

aaagtt«:agatgagctggcctcagatctgtgatagaacagtttcct 

rVLV POO CVL»BCBIEOI.MCL 13S6 

tttgtccttgtggatgggggctgtctcctaacccatggocacaagcactt^3atx;tgcttg 

A R ? Y h SXA XILLLPFPSAMI. 1376 

gctagatctgttctcagtaaggcgaagatc^gctgcttcatg^ 

P p M- y T Q I I ft. ft TLKOAFADCT 1396 

GATCCAG-TKACATACCAAATAATTACAAGAACTCTAAAACAAGCATTtGCT^ 

VILCEHRIEAMLECQOF L|V I 1416 
GTAATTCTCTCTGAACACACGATAGAACCAATGCTCGAATGCCAACAAY7 TTTtLl'CATA 



EENKVRQYD5 I OKLLNERSL 1436 
GAAGAGAACAAAGTGCGGCAGTACGATTCCATXXAGAAACKX^GAACGAGAGGA 

FRQAI SPSDRVKLFPHRHSS 1456 
^CCCGCAAGCCATCAGCCCCTCCGACAGGffTGAAGCTCTTTCCCCA^CGGAACTCAACC 

XCKSXPOIAALKEETEEEVQ 1476 
AAGTGCAAGTCTAAGCCCCAGATTGCTGCTCTGAAAGAGGAGACAGAACAAGAGGTGCAA 
D T R L 1480
4561 GATACAAGGCTTTAGAGAGCAGCATAAATGTTGACATGGGACATTTGCTCATGGAATTGG
4621 AGCTCGTGGGACAGTCACCTCATGGAATTCGAGCTCCTGGAACAGTTACCTCTGCCTCAG
4681 AAAACAAGGATGAATTAAGT ITIT niTAAAAAAGAAACATTTGGTAAGGGGAATTGAGG
4741 ACACTGATATGGGTCTTGATAAATGGCTTCCTGGCAATAGTCAAATTGTCTGAAAGGTAC
4801 TTCAAATCCTTGAAGATTTACCACTTGTGTTTT<XAAGCCAGATTTTCCTGAAAACCC^
4861 GCCATGIGCTAGTAATTGGAAAXX3CAGCTCTAAATGTCAATCAGCCTAGTTGATCAGCTT 
4921 ATTGTCTAGTGAAACTCGTTAATTTGTA(ntr?TGt^GAAGAACTGAAATCATACTTCTTA 
4981 GGGTTAIGAT TAAGTAATGATAACTGGAAAC7TCAGCGGTTTATATAAGCTTCJTATTCCT 
5041 TTTTCTCTCCTCTCCCCATGATGTTTAGAAACACAACTATATTGTTTGCTAAGCATTCCA 
5101 ACTATCTCATTTCCAAGCAAGTATTAGAATACCACAGGAACCACAAGACTGCACATCAAA 
5161 ATAT^CCCATTCAACATCTAGTGAGCAGTCAGGAAAGAGAACTTCCAGATCCTGGAAAT 
5221 CAGGGrEAGTATTGTCCAGGTCTACCAAAAAT C7CAATATTTCAGATAATCACAATACAT 
5281 CCCTTACCTGGGAAAGGGCTGTTATAATCTTT CACAGGGGACAGGATCGTTCCCTTGATG
5341 AAGAAGTTGATATGCCTTTTCCCAACTCCAGnAAGTGACAAGCTCACAGACCTTTGAACT 
5401 AGAGTTr ACCTGGAAAAGTATGTTAGTGCAAA rTCTCACAGGACAGCCCTTCT TICCACA 
5461 GAAGCTCCAGGTAGAGGGTG TGTAAGT AGAT A GGCC ATGGGCACTGTGGGT AGACA CACA
5521 TGAAGTCCAAGCATTTAGATGTATAGGTTGA7GCTGCTATGTTTTCAGCCTAGATGTATG 
5581 TACTTCATGCTGTCTACACT AAGAGAGAATGAGACACACACTCAACAAGCACCAATCATG 
5641 AATTAGTTTTATATGCTTCTGTTTTATAATTrTGTCAAGCAAAATTTTTTCTCTAGGAAA 
5701 TATTTATTTT AAT AATGTTT CAAACATATATTACAATCCTGTATT tTAAAACAATGATTA 
5761 TGAATTACATTTGTATAAAATAATTTTTATArrrrGAAATATTGACTTTTTATGGCACTAG 
5821 TATTTTTATGAAATATTATGTTAAAACTGGGACAGGGGAGAACCTAGGGTGATATTAACC
5881 AGGGGCCATGAATCACCTTTTGGTCTGGAGGCAAGCCTTGGGGCTGATCGAGTTGTTGCC 
5941 CACAGCTGTATGATTCCCAGCCAGACACAGCCrCTTAGATGCAGTTCTGAAGAAGATCGT 
6001 ACCACCAGTCTGACTGTTTCCATCAAGGGTACACTGCCTTCTCAACTCCAAACTGACTCT 
6061 TAAGAAGACTGCATTATATTTATTACTGTAA3AAAATATCACTTGTCAATAAAATCCATA 
6121 CATTTGTGT (A)n

Fig. 2. Nucleotide sequence of cDNA encoding the CF transmembrane
conductance regulator together with the deduced amino acid sequence.
DNA sequencing was performed by the dideoxy chain termination method
(34) with 35S-labeled nucleotides or by the Dupont Genesis 2000 automatic
DNA sequencer. Numbers on the left of columns indicate base positions and
numbers on the right amino acid residue positions. The first base position
corresponds to the first nucleotide in the 5' extension clone PA3-5, which is
one nucleotide longer than TB2-7 (12). The 3' end and the noncoding
sequence are shown above [nucleotides 4561 to 6129 plus the poly(A)
tail]. Arrows indicate position of transcription initiation site by primer
extension analysis (11). Nucleotide 6129 is followed by a poly(A) tract.
Positions of exon junctions are indicated by vertical lines. Potential
membrane-spanning segments ascertained with the use of the algorithm of
Eisenberg et al. (35) are enclosed in boxes. Amino acids comprising putative
ATP-binding folds are underlined. Possible sites of phosphorylation (27) by
protein kinases A or C are indicated by open and closed circles, respectively.
The open triangle indicates the position at which 3 bp are deleted in CF.
Abbreviations for the amino acid residues are: A, Ala; C, Cys; D, Asp; E,
Glu; F, Phe; G, Gly; H, His; I, Ile; K, Lys; L, Leu; M, Met; N, Asn; P, Pro;
Q, Gln; R, Arg; S, Ser; T, Thr; V, Val; W, Trp; and Y, Tyr.
Fig. 3. RNA gel-blot analysis. Hybridization by the cDNA clone 10-1 to a
6.5-kb transcript is shown in the tissues indicated. RNA samples were
prepared from cells and tissue samples obtained from surgical pathology or at
autopsy according to the methods described in (10). Total RNA (10 µg)
from each tissue and 1 µg of poly(A)+ RNA from T84 cells were separated
on formaldehyde gels and transferred onto nylon membranes (Zetaprobe,
Bio-Rad), which were hybridized with DNA probes labeled to high specific
activity by the random priming method (36, 37). The positions of the 28S
and 18S rRNA bands are indicated.



uous coding region of the CF locus could be deduced from 
overlapping cDNA clones. Since most of the cDNA clones were 
apparently derived from unprocessed transcripts, further studies 
were performed to ensure the authenticity of the consensus se- 
quence. Each cDNA clone was first tested for chromosome localiza- 
tion by hybridization analysis with a human-hamster somatic cell 
hybrid containing a single human chromosome 7 and by pulsed field 
gel electrophoresis (7). The ones that did not map to the correct 
region on chromosome 7 were not pursued. Fine restriction enzyme 
mapping was then performed for each clone. While overlapping 
regions were clearly identified for most of the clones, many con- 
tained single copy, additional regions not readily recognizable by 
restriction enzyme analysis. 

The cDNA was further characterized in gel hybridization experi- 
ments with genomic DNA. Five to six different restriction fragments 
could be detected with the 10-1 cDNA in Eco RI- or Hind III-
digested total human DNA and a similar number of fragments with
several other cDNA clones, suggesting the presence of multiple 
exons for the putative CF gene. The hybridization studies also 
identified the cDNA clones with unprocessed intron sequences 
when they showed preferential hybridization to a smaller subset of 
genomic DNA fragments with relatively greater intensities. For the 
confirmed cDNA clones, their corresponding genomic DNA seg- 
ments were isolated (7) and the exons and exon-intron boundaries 
were sequenced. In all, 24 exons were identified (Fig. 2). Physical 
mapping experiments (7) showed that the gene locus spanned about 
250 kb. 

The 5' terminus of the transcript was determined by primer 
extension (11). A modified polymerase chain reaction, anchored 
PCR (12), was also used to facilitate cloning of the 5' end sequences.



[Fig. 4 panels: lanes T16 and C16 with the B primer (left); lanes T16 and C16 with the D primer (right).]

Fig. 4. DNA sequence around the ΔF508 deletion. The normal sequence
from base position 1627 to 1651 (from cDNA T16-1) is shown beside the
CF sequence (from cDNA C16-1). The left panel shows the sequences from
the coding strands obtained with the B primer
(5'-GTTTTCCTGGATTATGCCTGGGCAC-3') and the right panel those from the
opposite strand with the D primer (5'-GTTGGCATGCTTTGATGACGCTTC-3'). The
brackets indicate the three nucleotides in the normal that are absent in CF
(arrowheads). Sequencing was performed as described in (34).
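
The consequence of removing three consecutive bases from an open reading frame can be sketched in a few lines. The snippet below is an illustration only: the 15-base sequence is an assumed stretch covering five codons around the deleted phenylalanine and is not read from Fig. 4, and the helper names and the five-entry codon table are hypothetical conveniences. It shows that an in-frame 3-bp deletion removes one residue and leaves the downstream reading frame intact.

```python
# Minimal codon table: only the codons that occur in this illustration.
CODONS = {"ATC": "I", "ATT": "I", "TTT": "F", "GGT": "G", "GTT": "V"}


def translate(dna):
    """Translate an in-frame DNA string, three bases at a time."""
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3))


def delete_bases(dna, start, length=3):
    """Return dna with `length` bases removed, beginning at 0-based index `start`."""
    return dna[:start] + dna[start + length:]


# ASSUMED sequence for five codons (Ile-Ile-Phe-Gly-Val); not taken from Fig. 4.
normal = "ATCATCTTTGGTGTT"

# Remove "CTT": the last base of the second Ile codon plus the first two
# bases of the Phe codon. The remaining bases re-pair into ATT (still Ile).
mutant = delete_bases(normal, start=5)

print(translate(normal))
print(translate(mutant))
```

Because the deletion spans a codon boundary, the flanking isoleucine is preserved only through the degeneracy of the code; a deletion of any length not divisible by three would instead shift the frame.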

Two independent 5' extension clones, one from pancreas and the 
other from T84 RNA, were characterized by DNA sequencing and 
differed by only 1 base in length, thus establishing the most 
probable initiation site for the transcript (Fig. 2). Since the initial 
cDNA clones did not contain a poly(A)+ tail indicative of the end of
a mRNA, anchored PCR was also applied to the 3' end of the
transcript (12). The results derived from the use of several different
3'-extending oligonucleotides were consistent with the interpretation
that the end of the transcript was about 1.2 kb downstream of
the Hind III site at nucleotide position 5027 (Fig. 2).

The complete cDNA sequence spans 6129 base pairs excluding
the poly(A)+ tail at the end of the 3' untranslated region and it
contains an ORF capable of encoding a polypeptide of 1480 amino
acids (Fig. 2). An ATG (AUG) triplet is present at the beginning of
this ORF (base position 133-135). Since the nucleotide sequence
surrounding this codon (5'-AGACCAUGCA-3') has the proposed
features of the consensus sequence (CC)(A/G)CCAUGG(G) of a eukaryotic
translation initiation site (13), with a highly conserved
A at the -3 position, it is highly probable that this AUG corresponds
to the first methionine codon for the putative polypeptide.
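
The coordinates quoted in this paragraph can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in the text (ORF start at base 133, 1480 codons, 6129 bases in total) plus the quoted initiation context; the variable names are illustrative, and the convention that the A of AUG is position +1 with no position 0 is the usual one but is an assumption here.

```python
CDNA_LENGTH = 6129   # bases, excluding the poly(A) tail
ORF_START = 133      # 1-based position of the A in the initiating ATG
N_CODONS = 1480      # amino acids in the predicted polypeptide

# Last coding base, then the three bases of the stop codon that follow it.
last_coding_base = ORF_START + 3 * N_CODONS - 1
stop_codon = (last_coding_base + 1, last_coding_base + 3)

five_prime_utr = ORF_START - 1
three_prime_utr = CDNA_LENGTH - stop_codon[1]

print("stop codon occupies bases", stop_codon)
print("5' untranslated bases:", five_prime_utr)
print("3' untranslated bases:", three_prime_utr)

# Context of the initiator codon as quoted in the text (RNA alphabet).
context = "AGACCAUGCA"
aug = context.index("AUG")    # 0-based index of the A in AUG (position +1)
minus_3 = context[aug - 3]    # position -3, three bases upstream of the A
plus_4 = context[aug + 3]     # position +4, the base directly after AUG

print("base at -3:", minus_3)
print("base at +4:", plus_4)
```

The computed stop codon position agrees with the line of Fig. 2 numbered 4561, which begins GATACAAGGCTTTAG: the TAG triplet falls on bases 4573 to 4575. The base at +4 is C rather than the consensus G, so the match to the consensus rests mainly on the A at -3, as the text states.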

Detection of mutation. A comparison between the cDNA 
Fig. 5. Hydropathy profile and predicted secondary structures of the CFTR.
(A) The mean hydropathy index determined according to Kyte and Doolittle
(19) of nine-residue peptides is plotted against the amino acid number. (B)
The corresponding positions of features of secondary structure predicted
according to Garnier et al. (19). C, coil; T, turn; S, sheet; H, helix.
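
A sliding-window hydropathy calculation of the kind described in panel (A) can be sketched as follows. This is a minimal illustration, not the authors' program: the function name is hypothetical, the toy peptide is invented, and whether the published plot assigns each window mean to the first or the central residue of the window is not stated in the caption. The scale values are the standard Kyte-Doolittle hydropathy indices.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=9):
    """Mean hydropathy of each `window`-residue peptide along `seq`."""
    if len(seq) < window:
        return []
    scores = [KD[aa] for aa in seq]
    return [sum(scores[i:i + window]) / window
            for i in range(len(seq) - window + 1)]


# HYPOTHETICAL toy peptide: a hydrophobic stretch flanked by charged residues.
toy = "KKDEE" + "LIVALLIVA" + "EEDKK"
profile = hydropathy_profile(toy)
peak = max(range(len(profile)), key=profile.__getitem__)
print(f"most hydrophobic window starts at residue {peak + 1}: {profile[peak]:.2f}")
```

Sustained stretches of high mean hydropathy are what such a profile uses to flag candidate membrane-spanning segments; the window length trades resolution against noise.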



sequences derived from CF and unaffected (N) individuals was next
conducted. Two clones, C16-1 and C1-1/5, were derived from a CF
sweat gland cDNA library and together they spanned almost the
entire coding region. The most striking difference between CF and
N sequences was a 3-bp deletion (Fig. 4), which would result in a
loss of a phenylalanine residue (position 508) in the predicted CF
polypeptide. This deletion (ΔF508) was detected in both CF clones.
To exclude the possibility that this difference was due to a cloning
artifact, sequence-specific oligonucleotides were used to screen
DNA samples from CF families. Specific hybridization could be
observed for each oligonucleotide probe with genomic DNA amplified
by PCR, confirming the presence of corresponding genomic
DNA sequences (9). Furthermore, the oligonucleotide specific for
the 3-bp deletion hybridized to 68 percent of chromosomes carrying
a CF mutation but not to any of the normal chromosomes (0/198),
an indication that a silent sequence polymorphism was unlikely.
Sequence differences found elsewhere among the different cDNA
clones probably represented sequence polymorphisms or cDNA
cloning artifacts (14).
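
To put the 0/198 observation in quantitative terms, one can ask how common the deletion could be among normal chromosomes and still plausibly be missed in 198 of them. The sketch below is an assumption-laden illustration and not an analysis reported in the article: it treats the 198 chromosomes as independent draws and computes a one-sided upper confidence bound for a proportion when no positives are seen; the function name is hypothetical.

```python
def upper_bound_zero_events(n, confidence=0.95):
    """Upper bound on a proportion when 0 of n independent samples are positive.

    Solves (1 - p) ** n = 1 - confidence for p.
    """
    return 1.0 - (1.0 - confidence) ** (1.0 / n)


n_normal = 198  # normal chromosomes tested, none of which carried the deletion
bound = upper_bound_zero_events(n_normal)
print(f"95% upper bound on the deletion frequency in normal chromosomes: {bound:.2%}")
```

A bound of this size, set against the 68 percent frequency on CF chromosomes, is the kind of contrast that makes a neutral polymorphism an unlikely explanation.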

Predicted protein structure. Analysis of the sequence of the 
overlapping cDNA clones (Fig. 2) predicted a polypeptide of 1480 
amino acids with a molecular mass of 168,138 daltons. The most 
characteristic feature of the predicted protein is the presence of two 
repeated motifs, each of which consists of a domain capable of 
spanning the membrane several times and sequences resembling 
consensus nucleotide (ATP)-binding folds (NBF's) (15) (Figs. 5 and 
6). These characteristics are remarkably similar to those of the 
mammalian multidrug resistance P-glycoprotein (16) and a number 
of other membrane-associated proteins (as discussed below), sug- 
gesting that the predicted CF gene product is likely to be involved in 
the transport of substances (ions) across the membrane and is 
probably a member of a membrane protein superfamily (17). For the 
convenience of future discussion and to avoid confusion with the 
previously named CF protein and CF factor (18), we will call the 
FSLLGTPVLKDINFKIERGQLLAVAGSTGAGKTSLLMMIMG
YTEGGNAILENISFSISPGQRVGLLGRTGSGKSTLLSAFLR 
PSRKEVKILKGLNLKVQSGQTVALVGNSGCGKSTTVQLMQR 
PTRPDIPVLQGLSLEVKKGQTLAI.VGSSGCGKSTWQLLER 
PSRSEVQILKGLNLKVKSGQTVALVGNSGCGKSTTVQLMQR 
PTRPN IP VLQGLSLEVKKGQTLALVGS SGCGKSTWQLLER 
PSRANIKILKGLNLKVKSGQTVALVGNSGCGKSTTVQLLQR 
PTRANVPVLQGLSLEVKKGQTLALVGSSGCGKSTWQLLER 
DTRKDVEIYKDLSFTLLKEGKTYAFVGESGCGKST ILKLI E 
I S RPN VP I YKNLSFTCD SKKTTAIVGETGSGKS TFMNLLL R 
PSRPSEAVLKNVSLNFSAGQFTFIVGKSGSGKSTLSNLLLR 
PSAPTAFVYKNMNFDMFCGQTLGIIGESGTGKSTLVLLLTK 
YKPDSPVJLDNINISIKQGEVIGIVGRSGSGKSTLIKLIQR 
I P APRKH LLKNVCGVAY PGELLAVMGS SGAGKTTLLNALAF 
KS LGNLKI LDRVSLYVP KFSLIALLGPSGSGKSSLLRILAG 
QDVAESTRLGPLSGEVRAGRILHLVGPNGAGKSTLUUUAG 
FYYGKFHALKNINLDTAKNQVTAFIGPSGCGKSTLLRTFNK 
RRYGGHEVLKGVSLQARAGDVI SIIGS SGSGKSTFLRCINF 
KAWGEWVSKDINIDIHEGEFWFVGPSGCGKSTLLRMIAG 
TP DGDVTAVN DLNFTLRAGE TLG I VGE SGS GKSQTAFALMG 
QPPKTLKAVDGVTLRLYEGETLGWGESGCGKSTFARAIIG 
KAVPGVKALSGAALNVYPGRVMALVGE NGAGKS TMMKVLTG 
VDNLCGPGVNDVSFTLRKGEILGVSGLMGAGRTELMKVLYG 
LTGARGNNLKDVTLTLPVGLFTCITGVSGSGKSTLINDTLF 
KSYGGKIWNDLSFTIAAGECFGLLGPNGAGKSTIIRMILG 
AYIiGGRQALQGVTFHMQPGEMAFLTGHSGAGKSTLLKLICG 



ISFCSQFSHIMPGTIK-ENIIFGVSYD 
DS I TLQQWRKAFGVI PQKVFIFSGTFR 
I GWS QE P VLFATTI -AEN I RYGRENV 
LGTVS QE P I L FDCS I -AEN I AYGDNS R 
IGWSQEPVLFATTI -AENI RYGREDV 
LGEVSQEP ILFDCSI -AENIAYGDNSR 
IGWSQEPVLSFTTI -AENIRYGRGNV 
LGIVSQEPILFDCS I -AENIAYGDNSR 
IGWSQDPLLFSNSI -KNNIKYSLYSL 
FS IVSQEPMLFNMS I - YENI KFGREDA 
ITWEQRCTLFNDTL -RKNI LLGSTDS 
ISWEQKPLLFNGTI -RDNLTYGLQDE 
VGWLQDNVLLNRSI-IDNISLAPGMS 
RCAYVQQDDLFIGLIAREHLIFQAMVR 
MSFVFQH YALFKHMTVYENI SFGLRLR 
YLSCjCiQT PP FAT P VWH Y LTL HQHDKTR 
VGMVFQKPTPFPMS I -YDNI AFGVRLF 
G I KVFQH FNLWS HMTVL ENVMEAP I QV 
VGMVFQSYALYPHLSVAENMSFGLKPA 
ISMI FQDPMTSLNP YMRVGEQLMEVX^M 
IQMIFCJDPLASLNPRMTIGEIIAEPLR 
AGI IHQELNLIPQLTIAENIFLGREFV 
ISEDRKRDGLVLGMSVKENMSLTALRY 
TYTGVFTPVRELFAGVPESRARGYTPG 
IGIVSQEDNLDLEFTVRENLLVYGRYF 
IGMI FQDHHLLKORTVYDNVAIPLI IA 



GEGGI TLSGGQRARI SLARAVYKDADLYLLDS PFGYLDVLTEK 
VDGGCVLSHGHKQLMCLARSVLSKAKILLLDEPSAHLDPVTYQ 
GERGAQLSGGQKQRIAIARALVRNPKILLLDEATSALDTESEA 
GDKGTLLSGGQKQRIAIARALVRQPHILLLDEATSALDTESEK 
GERGAQLSGGQKQRIAIARALVRNPKILLLDEATSALDTESEA 
GDKGTQLSGGQKQRIAIARALVRQPHILLLDEATSA1DTESEK 
GDRGAQLSGGQKQRIAIARALVRNP KI LLL DEAT SAL DTESEA 
GDKGTQLSGGQKQMAIARALIRQPRVLLLDEATSALDTESEK 
GS NAS KLSGGQKQRI S I ARAIMRNP K I LI L DEATS SL DNKS E Y 
PYGKS-LSGGQKQRIAIARALLREPKILLLDEATSSLDSNSEK 
GTGGVTLSGGQQQRVAIARAFIRDTPILFXDEAVSALDIVRRN 
RIDTTLLSGGQAQRLCIARALLRKSKILILDECTSALDSVSSS 
GEQGAGLSGGQRQRIAIARALVNNPKILIFDEATSALDYASEH 
PGRVKGLSGGERKRLAFASEALTDPPLLICDEPTSGLDSFTAH 
FEYPAQLSGGQKQRVALARS LAIQP DLLL-DEPFGALDGELRR 
GRS TNQLSGGEWQRVRIjAAWLQ I T LLLLDEPMNS LDVAQQSA 
HQSGYSLSGGQQCRLCIARG IAIRPEVLLLDEPCSALDPI STG 
GKYPVHLSGGQQCRVSIARALAMEPDVLLFDEPTSALDPELVG 

drkpkalsggqrqrvai grt lvaep svfllde pls nl daalrv 
kmyphefsggmrqrVmiamallcrpklliaoepttaldvtvqa 

NRYPHEFSGGOCQRIGIARALILEPKLI ICDDAVSALDVS IQA 
DKLV3DLSIGDQQMVEIAKVLSFESKVIIMDEPTCALIDTETE 
EQAIGLLSGGNQQKVAI ARGLMTRPKVLI LDEP TPGVDVGAKK 
GQSATTLSGGEAQRVKLARELSKRGLYILDEPTTGLHFADIQQ 
NTRVADLSGGMKRRLTLAGAilNDPQLLILDEPTTGLDPHARH 
KN FP I QLSCGEQQRVG I ARAWN KP AVL LADE PTGNLDD ALS E 



Fig. 6. Alignment of the three most conserved segments of the amino acid
sequences (single letter code) of the extended NBFs of CFTR with
comparable regions of other proteins. These three segments consist of
residues 433 to 473, 488 to 513, and 542 to 584 of the amino-terminal (N)
half and 1219 to 1259, 1277 to 1302, and 1340 to 1382 of the
carboxyl-terminal (C) half of CFTR. The heavy overlining points out the
regions of greatest similarity. The star indicates the position
corresponding to the phenylalanine that is deleted in CF. Additional
general homology can be seen even with the introduction of very few gaps.
The other sequences are of proteins involved in multidrug resistance in
human (hmdr1), mouse (mmdr1 and 2) (16), and Plasmodium falciparum
(pfmdr) (38); the a-factor pheromone export system of yeast (STE6) (39);
the hemolysin (hlyB) system of E. coli (22); screening of eye pigments in
Drosophila (White) (23); an unknown liverwort chloroplast function (MbpX)
(25); vitamin B12 transport in E. coli (BtuD) (24); phosphate transport in
E. coli (PstB) (40); histidine transport in Salmonella typhimurium (hisP)
(41); maltose transport in E. coli (malK) (42); oligopeptide transport in
S. typhimurium (oppD and oppF) (43); ribose transport in E. coli (RbsA)
(44). UvrA is one component of an E. coli DNA repair system (45); NodI is
a gene product involved in nodulation in Rhizobium (46); FtsE is a protein
that contributes to the regulation of cell division (47). In addition to
these proteins that contain this long NBF, there is a large number of
others that contain the two short nucleotide binding motifs A and B
initially pointed out by Walker et al. (48). Further, there are other
proteins containing only motif A or B (49).
putative CF gene product the cystic fibrosis transmembrane
conductance regulator (CFTR).

Each of the predicted membrane-associated regions of CFTR
consists of six hydrophobic segments capable of spanning a lipid
bilayer (19), which are followed by a large hydrophilic region
containing the NBFs (Fig. 5). On the basis of sequence alignment
with other nucleotide-binding proteins, each of the putative NBFs
in CFTR comprises at least 150 residues (Fig. 6). The single residue
deletion (ΔF508) detected in most of the CF patients is in the first
NBF, between the two most highly conserved segments within this
sequence. The amino acid sequence identity between the region
surrounding the ΔF508 mutation and the corresponding regions of
several other proteins suggests that this region is of functional
importance (Fig. 6). A hydrophobic amino acid, usually one with an
aromatic side chain, is present in most of these proteins at the
position corresponding to Phe508 of CFTR.
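
The Fig. 6 legend refers to the short nucleotide binding motif A of Walker et al. As a minimal sketch, the first conserved segment can be scanned for that motif with a regular expression. The pattern G-X4-G-K-[S/T] used here is a commonly quoted simplified consensus and is an assumption, not a pattern given in the text. The assignment of the first two printed rows to the amino- and carboxyl-terminal halves of CFTR is also assumed. The function name `find_walker_a` is hypothetical.

```python
import re

# Simplified Walker motif A consensus (G-X4-G-K-[S/T]); an assumed pattern,
# not one stated in the article.
WALKER_A = re.compile(r"G.{4}GK[ST]")

def find_walker_a(segment, first_residue=1):
    """Return (residue_number, matched_text) for the first motif A hit, or None."""
    match = WALKER_A.search(segment)
    if match is None:
        return None
    return first_residue + match.start(), match.group()

# First two rows of the first block of Fig. 6 as printed. The start residues
# (433 and 1219) come from the legend; the row-to-protein assignment is an
# assumption.
rows = [
    ("CFTR N (assumed)", "FSLLGTPVLKDINFKIERGQLLAVAGSTGAGKTSLLMMIMG", 433),
    ("CFTR C (assumed)", "YTEGGNAILENISFSISPGQRVGLLGRTGSGKSTLLSAFLR", 1219),
]

for label, segment, start in rows:
    print(label, find_walker_a(segment, start))
```

Under these assumptions the reported residue number is simply the legend's start residue plus the offset of the match within the printed row.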

Despite the overall symmetry in the two-motif structure of the
protein and the sequence conservation of the NBFs, sequence
identity between the two motifs of the predicted CFTR protein is
modest. The strongest identity is between sequences at the carboxyl
ends of the NBFs. Of the 66 residues aligned within these regions,
27 percent are identical and 11 percent are functionally similar. The
weak overall internal sequence identity is in contrast to the much
higher degree (>70 percent) in P-glycoprotein, for which a sequence
duplication hypothesis has been proposed (16). The lack of
conservation in the relative positions of the exon-intron boundaries
in the CF gene also argues against recent exon duplication as a
mechanism in the evolution of this gene (Fig. 2).
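
The percentages quoted above are counts over aligned positions. The following is a minimal sketch of such a calculation. The similarity classes in `SIMILAR_GROUPS` are an assumption, since the text does not define "functionally similar". The two example rows are the first two rows of the third block of Fig. 6, with the spaces introduced in reproduction removed, and are assumed to be the two CFTR halves. They cover 43 positions rather than the 66 discussed in the text, so the output is not expected to reproduce the published values.

```python
# Assumed classes of functionally similar residues (not defined in the text).
SIMILAR_GROUPS = ["KRH", "DE", "ST", "NQ", "AVLIM", "FYW"]

def compare_aligned(a, b):
    """Return (percent identical, percent similar) over ungapped columns."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    identical = similar = compared = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # skip gap columns
        compared += 1
        if x == y:
            identical += 1
        elif any(x in group and y in group for group in SIMILAR_GROUPS):
            similar += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared

# Rows copied from Fig. 6 (third block); assignment to CFTR N and C is assumed.
cftr_n = "GEGGITLSGGQRARISLARAVYKDADLYLLDSPFGYLDVLTEK"
cftr_c = "VDGGCVLSHGHKQLMCLARSVLSKAKILLLDEPSAHLDPVTYQ"

pct_identical, pct_similar = compare_aligned(cftr_n, cftr_c)
print(f"identical: {pct_identical:.1f}%  similar: {pct_similar:.1f}%")
```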

Since there is apparently no signal-peptide sequence at the amino 
terminus of CFTR (Fig. 7), the highly charged hydrophilic segment 
preceding the first transmembrane sequence is probably oriented in 
the cytoplasm. Each of the two sets of hydrophobic helices is
expected to form three traversing loops across the membrane and
little of the sequence of the entire protein is expected to be exposed 
to the exterior surface, except the region between transmembrane 
segments 7 and 8. It is of interest that the latter region contains two 
potential sites for N-linked glycosylation (20). 
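
Potential N-linked glycosylation sites are conventionally recognized as the sequon Asn-X-Ser/Thr, where X is any residue except proline. The sketch below scans a sequence for that pattern. The peptide is a hypothetical toy example, not CFTR sequence, and the function name `find_sequons` is likewise hypothetical.

```python
import re

# N-X-[S/T] sequon with X != P; the lookahead allows overlapping sites.
SEQUON = re.compile(r"(?=N[^P][ST])")

def find_sequons(sequence):
    """Return 1-based positions of the Asn in each N-X-[S/T] sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence)]

toy_peptide = "AANGTAANPSAA"  # hypothetical sequence for illustration only
print(find_sequons(toy_peptide))
```

A site found this way is only a candidate. Whether it is glycosylated also depends on its exposure to the exterior surface, which is why the location between transmembrane segments 7 and 8 matters.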

A highly charged cytoplasmic domain can be identified in the 
middle of the predicted CFTR polypeptide, linking the two halves 
of the protein. This domain, named the R domain, is operationally 
defined by a single large exon in which 69 of the 241 amino acids are 
polar residues arranged in alternating clusters of positive and 
negative charges. Moreover, nine of the ten sites at which there are 
consensus sequences for phosphorylation by protein kinase A and 
seven of the potential substrate sites for protein kinase C found in 
CFTR are located in this exon (21). 
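
Consensus phosphorylation sites of this kind can be located by pattern matching. The sketch below uses the simplified protein kinase A consensus [R/K]-[R/K]-X-[S/T]. This pattern is an assumption, since the text does not state which consensus was applied. The peptide is a hypothetical toy example and the function names are illustrative. The last line restates the polar fraction of the R-domain exon from the counts given in the text.

```python
import re

# Simplified PKA consensus [R/K]-[R/K]-X-[S/T]; an assumed pattern.
# The lookahead allows overlapping candidate sites.
PKA_SITE = re.compile(r"(?=[RK][RK].[ST])")

def pka_acceptor_positions(sequence):
    """Return 1-based positions of the Ser/Thr acceptor in each match."""
    return [match.start() + 4 for match in PKA_SITE.finditer(sequence)]

def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)

toy_peptide = "AARRASAAKKGTA"  # hypothetical sequence for illustration only
print(pka_acceptor_positions(toy_peptide))
print(percent(69, 241))  # polar residues in the R-domain exon, counts from the text
```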

Properties of CFTR could be further derived from comparison to 
other membrane-associated proteins (Fig. 6). In addition to the 
overall structural similarity with P-glycoproteins, each of the two 
predicted motifs in CFTR shows resemblance to the single motif 
structure of hemolysin B of Escherichia coli (22) and the product of 
the White gene of Drosophila (23). These proteins are involved in the
transport of the lytic peptide of the hemolysin system and of eye 
pigment molecules, respectively. The vitamin B12 transport system 
of E. coli, BtuD (24), and MbpX (25), which is a liverwort 
chloroplast gene product whose function is unknown, also have a 
similar structural motif. Further, CFTR shares structural similarity 
with several of the periplasmic solute transport systems of
Gram-negative bacteria, where the transmembrane region and the
ATP-binding folds are contained in separate proteins that function in
concert with a third substrate-binding polypeptide (26). 

The overall structural arrangement of the transmembrane do- 
mains in CFTR is similar to several cation channel proteins (27) and 
some cation-translocating adenosine triphosphatases (ATPases) (28) 
Fig. 7. Schematic model of the predicted
CFTR protein. The six membrane-spanning
helices in each half of the molecule are
depicted as cylinders. The cytoplasmically
oriented NBFs are shown as hatched
spheres with slots to indicate the means of
entry by the nucleotide. The large polar R
domain, which links the two halves, is
represented by a stippled sphere. Charged
individual amino acids are shown as small circles
containing the charge sign. Net charges on
the internal and external loops joining the
membrane cylinders and on regions of the NBFs are contained in open
squares. Potential sites for phosphorylation by protein kinases A or C (PKA
or PKC) and N-glycosylation (N-linked CHO) are as indicated. K, Lys; R,
Arg; H, His; D, Asp; and E, Glu.



as well as the recently described adenylate cyclase of bovine brain 
(29). Short regions of sequence identity have also been detected 
between the putative transmembrane regions of CFTR and other 
membrane-spanning proteins (30). In addition, a sequence of 18 
amino acids situated approximately 50 residues from the carboxyl 
terminus of CFTR shows some identity (12/18) with the raf
serine-threonine kinase proto-oncogene product of Xenopus laevis (31).

Finally, a sequence identity (10 of 13 amino acid residues) has 
been noted between a hydrophilic segment (position 701 to 713) 
within the highly charged R domain of CFTR and a region 
immediately preceding the first transmembrane loop of the sodium 
channels in both rat brain and eel (32). This feature of CFTR is not 
shared with the topologically closely related P-glycoprotein; the 
241-amino acid linking peptide is apparently the major difference
between the two proteins. 

Relevance to the CF anion transport defect. In view of the 
genetic data of Kerem et al. (9) and the tissue specificity and 
predicted properties of the CFTR protein, it is reasonable to 
conclude that CFTR is directly responsible for CF. It remains 
unclear, however, how CFTR is involved in the regulation of ion 
conductance across the apical membrane of epithelial cells. 

It is possible that CFTR serves as an ion channel itself. For
example, 10 of the 12 putative transmembrane regions contain one
or more amino acids with charged side chains (Fig. 7), a property
similar to that of the brain sodium channel and the γ-aminobutyric
acid (GABA) receptor chloride channel subunits, where charged
residues are present in four of the six, and three of the four,
respective membrane-associated domains per subunit or repeat unit
(32, 33). The amphipathic nature of these transmembrane segments
is believed to contribute to the channel-forming capacity of these
molecules. In contrast, the closely related P-glycoprotein, which is
not believed to conduct ions, has only two charged residues in all 12
transmembrane domains. Alternatively, CFTR may not be an ion 
channel but instead it may serve to regulate ion channel activities. In 
support of the latter possibility, none of the recently purified 
polypeptides (from trachea and kidney) that are capable of reconsti- 
tuting chloride channels in lipid membranes (6) appear to be CFTR, 
judged on the basis of molecular mass. 
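
The comparison of charged residues across transmembrane regions amounts to a simple tally per segment. The sketch below counts how many segments contain at least one charged residue, using the residues listed as charged in the Fig. 7 legend (K, R, H, D, E). The segments are hypothetical toy sequences, not the predicted CFTR helices, and the function name is illustrative.

```python
# Residues treated as charged, following the Fig. 7 legend (K, R, H, D, E).
CHARGED = set("KRHDE")

def count_segments_with_charge(segments):
    """Return the number of segments with at least one charged residue."""
    return sum(1 for segment in segments if any(aa in CHARGED for aa in segment))

def charged_residue_total(segments):
    """Return the total number of charged residues over all segments."""
    return sum(1 for segment in segments for aa in segment if aa in CHARGED)

# Hypothetical transmembrane segments for illustration only.
toy_segments = ["LLVVAAILLG", "LLKVAAILLG", "AAEILLDVVG"]
print(count_segments_with_charge(toy_segments), "of", len(toy_segments))
print(charged_residue_total(toy_segments))
```

The first function corresponds to a statement of the form "10 of the 12 regions contain one or more charged residues", the second to a statement such as "only two charged residues in all 12 transmembrane domains".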

In any case, the presence of ATP-binding domains in CFTR 
suggests that ATP hydrolysis is directly involved and required for 
the transport function. The high density of phosphorylation sites for 
protein kinases A and C and the clusters of charged residues in the R 
domain may both serve to regulate this activity. The deletion of
Phe508 in the NBF may prevent proper binding of ATP or the
conformational change required for normal CFTR activity,
consequently resulting in the observed insensitivity to activation by
protein kinase A- or protein kinase C-mediated phosphorylation of
the CF apical chloride conductance pathway (5). Since the predicted
structure of CFTR contains several conserved domains and belongs 
to a family of proteins, most of which function as parts of 
multicomponent molecular systems (15), the CFTR protein may 
also participate in epithelial cell functions not related to ion 
transport. 

To understand the basic defect in CF, it is necessary to determine 
the precise role of Phe508 in the regulation of ion transport and to
understand the mechanism that leads to the pathophysiology of the 
disease. With the CF gene (that is, the cDNA) now isolated, it 
should be possible to elucidate the control of ion transport pathways 
in epithelial cells in general. Knowledge gained from study of the CF 
gene product (CFTR), both the normal and mutant forms, will 
provide a molecular basis for the development of improved means of 
treatment of the disease. 